I'm creating a fictional map, so I need to create lots of points, lines and, of course, polygons. Later on I export my data as GeoJSON. But before that I always have to go and give every element a unique ID.
I don't need a special sorting, like the biggest polygon gets the smallest ID or so. I just need all polygons with an ID at the end, without doing it manually like I have to do now.
Would be great if someone knows how to do that.
Using the field calculator is the way to go:
No ID was given in
- Digitize every feature without entering any Id.
- Before export, update unique Ids with the expression, '$Id' using the field calculator.
Some ID's already given in
- If you already have IDs, you can use '-$Id'. Make sure you select only the new features, meaning those that are 'NULL' in the id column. Simply do that by sorting the column.
- Now do the steps from the pictures:
Hallelujah! Or Eureka. Or whatever. This can be done. With a shapefile.
- If there isn't already one, add a field to contain the feature id, say "FID", of type Whole number (integer).
- Open Layer Properties (right-click on the layer and choose Properties… or double-click the layer), click on the Attributes Form tab, then under General uncheck Editable and under Defaults in the field Default value type maximum("FID") + 1.
By unchecking Editable, you can't enter another value or delete what's there. Note that if there are values without an ID, these values won't be updated. At some point I'll experiment with checking Apply default value on update and revising my formula to check for a zero or NULL value to update only those records when they are edited, not any record with a value greater than 1. (Earlier in this post it was discussed how to update the FID field with unique values, which you will need to do if you added the field after there were already features in the shapefile.)
Note that this is saved with the current map file, not the shapefile, so adding that shapefile multiple times will require you to copy that part of the layer style to the newly added layer. To do this, right-click on the layer, choose Styles > Copy Style > Fields, and right-click on another layer, choose Styles > Paste Style > All Style Categories (or continue to Fields). You can also copy that part of the style to any other layer based on a shapefile, but the ID field must have the same name as the layer you're copying from.
I would like to add to vinayan's post and briefly mention the rownum function, as it is very similar and in some cases might be a little more convenient.
$id returns the Feature ID, meaning that it always starts at zero.
$rownum returns the number of the row, meaning that it starts at one.
So, basically, if you want the auto-increment to start at 0 go for $id, and if you want it to start at 1 then go for $rownum.
Update for QGIS 3
I know I am quite late to this but always good to give any updates:
In QGIS 3 there is now a native tool which can be used to do this exact job and it is called "Add autoincremental field"
No need to use an expression in the field calculator or do any coding but nevertheless these are all still very useful and good to know.
This topic has come up here: Create Shapefile with auto increment primary key in QGIS
My suggestions would be:
1) SQLite / SpatiaLite databases support auto-incrementing on a field set to INTEGER PRIMARY KEY:
On an INSERT, if the ROWID or INTEGER PRIMARY KEY column is not explicitly given a value, then it will be filled automatically with an unused integer, usually the one more than the largest ROWID currently in use. This is true regardless of whether or not the AUTOINCREMENT keyword is used.
Each time you edit/create polygons, you can fill out their attributes, and SQLITE will give it an incremental unique value in the field you have set to INTEGER PRIMARY KEY type.
When you're ready to export to GEOJSON, you're all set with your UNIQUE ID's.
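The quoted behaviour can be tried from Python's built-in sqlite3 module. This is only a minimal sketch: the table name polygons and the columns fid and name are made up for illustration, and a real SpatiaLite layer would also carry a geometry column.

```python
import sqlite3

# In-memory database; table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE polygons (fid INTEGER PRIMARY KEY, name TEXT)")

# Insert rows without supplying fid; SQLite fills in an unused integer.
for name in ("forest", "lake", "village"):
    conn.execute("INSERT INTO polygons (name) VALUES (?)", (name,))

rows = conn.execute("SELECT fid, name FROM polygons ORDER BY fid").fetchall()
print(rows)
```

On an empty table the generated values start at 1, so the three rows receive 1, 2 and 3.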
2) If using Shapefiles, create an OBJECTID field of INTEGER type and use a field calculator expression to populate that field each time you edit/create polygons and need to export them. You will lose the original ID a polygon once had, but this is the only way to achieve this using .SHP. (I will have to find the field calculator expression.)
PostGIS is another data source you might want to explore, though more of a heavy lift than SQLITE, you might find value in such a system as you move forward…
Old post but for anyone else looking for a speedy solution mine was to create a field with $ID + 1 and it will automatically generate starting with 1!
If you don't need something humanly digestible, there is now an easy fix: in the field properties, select "UUID Generator", and leave everything blank.
This will automatically create a UUID in the field. Not as friendly as a simple number (as per $id or $rownum), but generates the UUID right from the start, so no successive steps.
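Outside QGIS, the same idea can be sketched with Python's standard uuid module. The helper name new_feature_id is hypothetical, and the exact text format QGIS writes into the field may differ from what str() produces here.

```python
import uuid

def new_feature_id() -> str:
    """Return a random (version 4) UUID as text."""
    return str(uuid.uuid4())

# Generate a batch of IDs; no counter or database lookup is needed.
ids = [new_feature_id() for _ in range(1000)]
print(ids[0])
```

Each value is generated independently, which is why no successive renumbering step is needed.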
You can just delete the first column (id) and create a new one "As virtual field".
Simply copy the files and create a CSV copy and display again creating OIDs and export as new shapefile or feature class.
These solutions no longer worked for me in QGIS 2.01 Dufour. Typing $id on a new or existing field named 'id' in the expression input field gave me an error "Expression is invalid".
What did work was to type the function $rownum and then click "OK".
The easiest way to do this would be probably with a python script or maybe it's possible with the field calculator. Sorry I don't have one for you, maybe someone else will. In the meantime, I'd search for a python script for it. I've seen a lot about this for ArcGIS, but I'm sure there's something out there for QGIS.
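For what it's worth, here is a rough sketch of such a script that works on the exported GeoJSON rather than inside QGIS. The function name assign_ids and the property name id are assumptions; it keeps integer IDs that already exist and numbers the remaining features after the current maximum.

```python
import json

def assign_ids(collection, field="id"):
    """Give every feature without an integer ID the next free number."""
    features = collection["features"]
    for feature in features:
        # GeoJSON allows "properties": null, so normalise that first.
        if feature.get("properties") is None:
            feature["properties"] = {}
    used = [f["properties"][field] for f in features
            if isinstance(f["properties"].get(field), int)]
    next_id = max(used, default=0) + 1
    for feature in features:
        if not isinstance(feature["properties"].get(field), int):
            feature["properties"][field] = next_id
            next_id += 1
    return collection

# Hypothetical sample: one feature with an ID, two without.
sample = {"type": "FeatureCollection", "features": [
    {"type": "Feature", "geometry": None, "properties": {"id": 7}},
    {"type": "Feature", "geometry": None, "properties": {"id": None}},
    {"type": "Feature", "geometry": None, "properties": None},
]}
result = assign_ids(sample)
ids = [f["properties"]["id"] for f in result["features"]]
print(json.dumps(ids))
```

For a real file, load it with json.load, pass the resulting dict to assign_ids, and write it back with json.dump.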
Issue with Auto increment in MYSQL
I want to keep JOB_NAME, SANDBOX, PARENT_JOB_NAME as the primary key and JOB_ID as auto-increment, because I am using "ON DUPLICATE KEY UPDATE" and, because of the auto-increment, it is inserting new rows and creating duplicates in the table.
And when I remove job_id from the primary key, I get the error "Incorrect table definition; there can be only one auto column and it must be defined as a key".
It's never a bad idea to have a guaranteed unique row identifier. I guess I shouldn't say never, but let's go with the overwhelming majority of the time it's a good idea.
Theoretical potential downsides include an extra index to maintain and extra storage space used. That's never been enough of a reason to me to not use one.
TLDR: Use UUID's instead of auto-increment, if you don't already have a unique way of identifying each row.
I disagree with all the answers before. There are many reasons why it is a bad idea to add an auto increment field in all tables.
If you have a table where there are no obvious keys, an auto-increment field seems like a good idea. After all, you don't want to select * from blog where body = '[10000 character string]'. You'd rather select * from blog where id = [number]. I'd argue that in most of these cases, what you really want is a unique identifier, not a sequential unique identifier. You probably want to use a universally unique identifier instead.
There are functions in most databases to generate random unique identifiers ( uuid in mysql, postgres. newid in mssql). These allow you to generate data into multiple databases, on different machines, at any time, with no network connection between them, and still merge data with zero conflicts. This allows you to more easily setup multiple servers and even data centers, like for example, with microservices.
This also avoids attackers guessing url's to pages they shouldn't have access to. If there's a https://example.com/user/1263 there's probably a https://example.com/user/1262 as well. This could allow automation of a security exploit in the user profile page.
There are also a lot of cases where a uuid column is useless or even harmful. Let's say you have a social network. There is a users table and a friends table. The friends table contains two userid columns and an auto-increment field. You want 3 to be friends with 5, so you insert 3,5 into the database. The database adds an auto-increment id and stores 1,3,5. Somehow, user 3 clicks the "add friend" button again. You insert 3,5 into the database again, the database adds an auto-increment id and inserts 2,3,5. But now 3 and 5 are friends with each other twice! That's a waste of space, and if you think about it, so is the auto-increment column. All you need to see if a and b are friends is to select for the row with those two values. They are, together, a unique row identifier. (You would probably want to write some logic to make sure 3,5 and 5,3 are deduplicated.)
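As a sketch of that idea, here it is with SQLite via Python's sqlite3 module (the table name friends, the columns user_a/user_b and the helper add_friendship are all made up): the pair itself is the primary key, and the pair is stored smallest-first so 3,5 and 5,3 end up as the same row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE friends (
        user_a INTEGER NOT NULL,
        user_b INTEGER NOT NULL,
        PRIMARY KEY (user_a, user_b)
    )
""")

def add_friendship(conn, x, y):
    """Store the pair in a canonical order; repeats are silently skipped."""
    a, b = min(x, y), max(x, y)
    conn.execute(
        "INSERT OR IGNORE INTO friends (user_a, user_b) VALUES (?, ?)", (a, b)
    )

add_friendship(conn, 3, 5)
add_friendship(conn, 3, 5)  # user clicks "add friend" again
add_friendship(conn, 5, 3)  # same friendship from the other side

friend_rows = conn.execute("SELECT user_a, user_b FROM friends").fetchall()
print(friend_rows)
```

Only one row survives the three calls, and no separate id column is involved.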
There are still cases where sequential id's can be useful, like when building an url-shortener, but mostly (and even with the url shortener) a randomly generated unique id is what you really want to use instead.
Auto-incremental keys mostly have advantages.
But some possible drawbacks could be:
- If you have a business key, you have to add a unique index on that column(s) too in order to enforce business rules.
- When transferring data between two databases, especially when the data is in more than one table (i.e. master/detail), it's not straightforward since sequences are not synced between databases, and you'll have to create an equivalence table first using the business key as a match to know which ID from the origin database corresponds with which ID in the target database. That shouldn't be a problem when transferring data from/to isolated tables, though.
- Many enterprises have ad-hoc, graphical, point-and-click, drag-and-drop reporting tools. Since autoincremental IDs are meaningless, users of these tools will find it hard to make sense of the data outside "the app".
- If you accidentally modify the business key, chances are you will never recover that row because you no longer have something for humans to identify it. That caused a fault in the BitCoin platform once.
- Some designers add an ID to a join table between two tables, when the PK should simply be composed of the two foreign IDs. Obviously if the join table is between three or more tables, then an autoincremental ID makes sense, but then you have to add a unique key on the combination of FKs, where it applies, to enforce business rules.
Here's a Wikipedia article section on the disadvantages of surrogate keys.
Just to be contrary, No, you do NOT need to always have a numeric AutoInc PK.
If you analyse your data carefully you often identify natural keys in the data. This is often the case when the data has intrinsic meaning to the business. Sometimes the PKs are artefacts from ancient systems that the business users utilize as a second language to describe attributes of their system. I've seen vehicle VIN numbers used as the primary key of a "Vehicle" table in a fleet management system for example.
However it originated, IF you already have a unique identifier, use it. Don't create a second, meaningless primary key; it's wasteful and may cause errors.
Sometimes you can use an AutoInc PK to generate a customer meaningful value e.g. Policy Numbers. Setting the start value to something sensible and applying business rules about leading zeros etc. This is probably a "best of both worlds" approach.
When you have small numbers of values that are relatively static, use values that make sense to the system user. Why use 1,2,3 when you could use L,C,H where L, C and H represent Life, Car and Home in an insurance "Policy Type" context, or, returning to the VIN example, how about using "TO" for Toyota? All Toyota cars have a VIN that starts "TO". It's one less thing for users to remember, makes it less likely for them to introduce programming and user errors, and may even be a usable surrogate for a full description in management reports, making the reports simpler to write and maybe quicker to generate.
A further development of this is probably "a bridge too far" and I don't generally recommend it but I'm including it for completeness and you may find a good use for it. That is, use the Description as the Primary Key. For rapidly changing data this is an abomination. For very static data that is reported on All The Time, maybe not. Just mentioning it so it's sitting there as a possibility.
I DO use AutoInc PKs, I just engage my brain and look for better alternatives first. The art of database design is making something meaningful that can be queried quickly. Having too many joins hinders this.
EDIT: One other crucial case where you do not need an autogenerated PK is the case of tables that represent the intersection of two other tables. To stick with the Car analogy, a Car has 0..n Accessories, and each Accessory can be found on many Cars. So to represent this you create a Car_Accessory table containing the PKs from Car and Accessory and other relevant information about the link (dates etc.).
What you don't (usually) need is an AutoInc PK on this table. It will only be accessed via the Car ("tell me what accessories are on this car") or from the Accessory ("tell me what cars have this accessory").
Many tables already have a natural unique id. Do not add another unique id column (auto-increment or otherwise) onto these tables. Use the natural unique id instead. If you add another unique id, you essentially have a redundancy (duplication or dependency) in your data. This goes against the principles of normalization. One unique id is dependent on the other for accuracy. This means that they have to be kept perfectly in sync at all times in every system that manages these rows. It's just another fragility in your data integrity that you don't really want to have to manage and validate long term.
Most tables these days don't really need the very minor performance boost that an additional unique id column would give (and sometimes it even detracts from performance). As a general rule in IT, avoid redundancy like the plague! Resist it everywhere it is suggested to you. It's anathema. And take heed of the quote. Everything should be as simple as possible, but not simpler. Don't have two unique ids where one will suffice, even if the natural one seems less tidy.
On larger systems, an ID is a consistency booster; do use it almost everywhere. In this context, individual primary keys are NOT recommended; they are expensive at the bottom line.
Every rule has an exception, so you might not need integer autoincrement ID on staging tables used for export/import and on similar one-way tables or temporary tables. You would also prefer GUID's instead of ID's on distributed systems.
Many answers here suggest that existing unique key should be taken. Well even if it has 150 characters? I don't think so.
Now my main point:
It looks like opponents of the autoincrement integer ID are speaking about small databases with up to 20 tables. There they can afford an individual approach to each table.
BUT once you have an ERP with 400+ tables, having an integer autoincrement ID everywhere (except the cases mentioned above) just makes great sense. You do not rely on other unique fields even if they are present and secured for uniqueness.
- You benefit from universal time-saving, effort-saving, easy-to-remember convention.
- In most cases you JOIN tables, without need of checking what the keys are.
- You can have universal code routines working with your integer autoincrement column.
- You can extend your system with new tables or user plugins not foreseen before simply by referring to ID's of existing tables. They are already there from the beginning, no costs to add them additionally.
On larger systems, it can be worth ignoring the minor benefits of those individual primary keys and consistently using an integer autoincrement ID in most cases. Using existing unique fields as primary keys may save some bytes per record, but additional storage or indexing time poses no issue in today's database engines. Actually you are losing much more money and resources on the wasted time of developers and maintainers. Today's software should be optimized for the time and effort of programmers, and the approach with consistent IDs fulfills that much better.
It is not good practice to create superfluous designs. I.e., it is not good practice to always have an auto-increment int primary key when one is not needed.
Let's see an example where one is not needed.
You have a table for articles; this has an int primary key id, and a varchar column named title.
You also have a table full of article categories: id int primary key, varchar name.
One row in the Articles table has an id of 5, and a title "How to cook goose with butter". You want to link that article with the following rows in your Categories table: "Fowl" (id: 20), "Goose" (id: 12), "Cooking" (id: 2), "Butter" (id: 9).
Now, you have 2 tables: articles and categories. How do you create the relationship between the two?
You could have a table with 3 columns: id (primary key), article_id (foreign key), category_id (foreign key). But now you have something like:
A better solution is to have a primary key that is made up of 2 columns.
This can be accomplished by doing:
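The original snippet is not reproduced here, so the following is only an illustrative sketch using SQLite through Python's sqlite3 module; the link table name article_categories is an assumption. It reuses the goose article and its four categories from the example above and shows that the two-column primary key rejects a repeated link.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE articles (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE categories (id INTEGER PRIMARY KEY, name TEXT);
    -- No separate id column: the pair of foreign keys is the primary key.
    CREATE TABLE article_categories (
        article_id  INTEGER NOT NULL REFERENCES articles(id),
        category_id INTEGER NOT NULL REFERENCES categories(id),
        PRIMARY KEY (article_id, category_id)
    );
""")
conn.execute("INSERT INTO articles VALUES (5, 'How to cook goose with butter')")
conn.executemany(
    "INSERT INTO categories VALUES (?, ?)",
    [(20, "Fowl"), (12, "Goose"), (2, "Cooking"), (9, "Butter")],
)
conn.executemany(
    "INSERT INTO article_categories VALUES (5, ?)", [(20,), (12,), (2,), (9,)]
)

# Linking article 5 to "Goose" a second time violates the composite key.
try:
    conn.execute("INSERT INTO article_categories VALUES (5, 12)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

link_count = conn.execute("SELECT COUNT(*) FROM article_categories").fetchone()[0]
print(duplicate_rejected, link_count)
```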
Another reason not to use an auto increment integer is if you are using UUIDs for your primary key.
UUIDs are by their definition unique, which accomplishes the same thing that using unique integers does. They also have their own added benefits (and cons) over integers. For instance, with a UUID, you know that the unique string you're referring to points to a particular data record. This is useful in cases where you do not have one central database, or where applications have the ability to create data records offline (then upload them to the database at a later date).
In the end, you need to not think about primary keys as a thing. You need to think of them as the function they perform. Why do you need primary keys? To be able to uniquely identify specific sets of data from a table using a field that will not be changed in the future. Do you need a particular column called id to do this, or can you base this unique identification off of other (immutable) data?
Or are there scenarios where you don't want to add such a field?
First of all, there are databases that have no autoincrements (e.g., Oracle, which certainly is not one of the smallest contenders around). This should be a first indication that not everybody likes or needs them.
More important, think about what the ID actually is: it is a primary key for your data. If you have a table with a different primary key, then you do not need an ID, and should not have one. For example, a table (EMPLOYEE_ID, TEAM_ID) (where each employee can be in several teams concurrently) has a clearly defined primary key consisting of those two IDs. Adding an autoincrement ID column, which would also be a primary key for this table, would make no sense at all. Now you are lugging 2 primary keys around, and the first word in "primary key" should give you a hint that you really should have only one.
I usually use an "identity" column (auto-incrementing integer) when defining new tables for "long-lived" data (records I expect to insert once and keep around indefinitely even if they end up "logically deleted" by setting a bit field).
There are a few situations I can think of when you don't want to use them, most of which boil down to scenarios where one table on one instance of the DB cannot be the authoritative source for new ID values:
- When incremental IDs would be too much information for a potential attacker. Use of an identity column for "public-facing" data services makes you vulnerable to the "German Tank Problem": if record id 10234 exists, it stands to reason that record 10233, 10232, etc. exist, back to at least record 10001, and then it's easy to check for record 1001, 101 and 1 to figure out where your identity column started. V4 GUIDs composed of mainly random data break this incremental behavior by design, so that just because one GUID exists, a GUID created by incrementing or decrementing a byte of the GUID does not necessarily exist, making it harder for an attacker to use a service intended for single-record retrieval as a dump tool. There are other security measures that can better restrict access, but this helps.
- In M:M cross-reference tables. This one's kind of a gimme but I've seen it done before. If you have a many-to-many relationship between two tables in your database, the go-to solution is a cross-reference table containing foreign key columns referencing each table's PK. This table's PK should virtually always be a compound key of the two foreign keys, to get the built-in index behavior and to ensure uniqueness of the references.
- When you plan on inserting and deleting in bulk on this table a lot. Probably the biggest disadvantage to identity columns is the extra hoopla you have to go through when doing an insert of rows from another table or query, where you want to maintain the original table's key values. You have to turn "identity insert" on (however that's done in your DBMS), then manually make sure the keys you're inserting are unique, and then when you're done with the import you have to set the identity counter in the table's metadata to the maximum value present. If this operation happens a lot on this table, consider a different PK scheme.
- For distributed tables. Identity columns work great for single-instance databases, failover pairs, and other scenarios where one database instance is the sole authority on the entire data schema at any given time. However, there's only so big you can go and still have one computer be fast enough. Replication or transaction log shipping can get you additional read-only copies, but there's a limit to that solution's scale as well. Sooner or later you'll need two or more server instances handling inserts of data and then synchronizing with each other. When that situation comes, you'll want a GUID field instead of an incremental one, because most DBMSes come pre-configured to use a portion of the GUIDs they generate as an instance-specific identifier, then generate the rest of the identifier either randomly or incrementally. In either case, the odds of a collision between two GUID generators are nil, while an identity integer column is a nightmare to manage in this situation (you can go even/odd by offsetting seeds and setting the increment to 2, but if one server sees more activity than the other you're wasting IDs).
- When you have to enforce uniqueness across multiple tables in the DB. It's common in accounting systems, for instance, to manage the General Ledger (with a row for each credit or debit of every account that has ever occurred, so it gets very big very quickly) as a sequence of tables each representing one calendar month/year. Views can then be created to hook them together for reporting. Logically, this is all one very big table, but chopping it up makes the DB's maintenance jobs easier. However, it presents the problem of how to manage inserts into multiple tables (allowing you to begin logging transactions in the next month while still closing out the last) without ending up with duplicate keys. Again, GUIDs instead of identity integer columns are the go-to solution, as the DBMS is designed to generate these in a truly unique way, so that a single GUID value will be seen once and only once in the entire DBMS.
There are workarounds that allow use of identity columns in these situations, as I've hopefully mentioned, but in most of these, upgrading from the identity integer column to a GUID is simpler and solves the problem more completely.
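The even/odd workaround mentioned for distributed tables can be simulated in a few lines. This is a toy sketch in plain Python, not any DBMS feature: make_allocator, server_a and server_b are invented names standing in for two identity columns with offset seeds and an increment of 2.

```python
from itertools import count, islice

def make_allocator(seed, increment):
    """Simulate an identity column with the given seed and increment."""
    return count(start=seed, step=increment)

server_a = make_allocator(seed=1, increment=2)  # odd IDs
server_b = make_allocator(seed=2, increment=2)  # even IDs

# Server A is busier than server B.
ids_a = list(islice(server_a, 5))
ids_b = list(islice(server_b, 2))

print(ids_a, ids_b)
# IDs below the current maximum that neither server has handed out yet.
unused = sorted(set(range(1, max(ids_a + ids_b) + 1)) - set(ids_a) - set(ids_b))
print(unused)
```

The two ranges never collide, but the uneven load leaves the even numbers 6 and 8 sitting unused below the busier server's current maximum, which is the "wasting IDs" effect described above.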
An auto-incremented (identity) primary key is a good idea except to note that it is meaningless outside of the context of the database and immediate clients of that database. For example, if you transfer and store some of the data in another database, then proceed to write different data to both database tables, the id's will diverge - i.e., data with an id of 42 in one database won't necessarily match the data with an id of 42 in the other.
Given this, if it's necessary to still be able to identify rows uniquely outside of the database (and it frequently is), then you must have a different key for this purpose. A carefully selected business key will do, but you'll often end up in a position of a large number of columns required to guarantee uniqueness. Another technique is to have an Id column as an auto-increment clustered primary-key and another uniqueidentifier (guid) column as a non-clustered unique key, for the purposes of uniquely identifying the row wherever it exists in the world. The reason you still have an auto-incremented key in this case is because it's more efficient to cluster and index the auto-incrementing key than it is to do the same to a guid.
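A minimal sketch of the two-key technique, using SQLite through Python's sqlite3 module. The table customers and the helper insert_customer are hypothetical, and SQLite does not have the clustered/non-clustered distinction in the SQL Server sense, so this only illustrates the shape of the schema.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE customers (
        id   INTEGER PRIMARY KEY,   -- local auto-incrementing key
        guid TEXT NOT NULL UNIQUE,  -- identifier that is valid outside this database
        name TEXT
    )
""")

def insert_customer(conn, name):
    """Insert a row and return the GUID that other systems can refer to."""
    guid = str(uuid.uuid4())
    conn.execute("INSERT INTO customers (guid, name) VALUES (?, ?)", (guid, name))
    return guid

external_key = insert_customer(conn, "Ada")
row = conn.execute(
    "SELECT id, name FROM customers WHERE guid = ?", (external_key,)
).fetchone()
print(row)
```

Joins inside the database can use the compact id, while exports and other systems refer to the row by its guid.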
One case where you might not want an auto-incrementing key would be a many-to-many table where the primary key is a compound of the Id columns of two other tables (you could still have an auto-incrementing key here, but I don't see the point of it).
One other question is the datatype of the auto-incremented key. Using an Int32 gives you a large, but relatively limited range of values. Personally I frequently use bigint columns for the Id, in order to practically never need to worry about running out of values.
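To put rough numbers on that, here is a back-of-the-envelope calculation. The rate of 1000 inserts per second is an arbitrary assumption chosen only for illustration.

```python
INT32_MAX = 2**31 - 1   # largest signed 32-bit value
INT64_MAX = 2**63 - 1   # largest signed 64-bit (bigint) value

def seconds_until_exhausted(max_value, inserts_per_second):
    """Time to use up every positive key at a constant insert rate."""
    return max_value / inserts_per_second

RATE = 1000  # assumed inserts per second

days_int32 = seconds_until_exhausted(INT32_MAX, RATE) / 86_400
years_int64 = seconds_until_exhausted(INT64_MAX, RATE) / (86_400 * 365)

print(f"int32 lasts about {days_int32:.1f} days")
print(f"bigint lasts about {years_int64:.2e} years")
```

At that sustained rate a 32-bit key runs out in under a month, while a 64-bit key lasts for hundreds of millions of years.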
As other people have made the case for an incrementing primary key I will make one for a GUID:
- It is guaranteed to be unique
- You can have one less trip to the database for data in your application. (For a types table for instance you can store the GUID in the application and use that to retrieve the record. If you use an identity you need to query the database by name and I have seen many an application that does this to get the PK and later queries it again to get the full details).
- It is useful for hiding data. www.domain.com/Article/2 Lets me know you only have two articles whereas www.domain.com/article/b08a91c5-67fc-449f-8a50-ffdf2403444a tells me nothing.
- You can merge records from different databases easily.
- MSFT uses GUIDS for identity.
As a principle of good design, every table should have a reliable way to uniquely identify a row. Although that is what a primary key is for, it doesn't always require the existence of a primary key. Adding a primary key to every table is not a bad practice since it provides for unique row identification, but it may be unnecessary.
To maintain reliable relationships between the rows of two or more tables, you need to do it via foreign keys, hence the need for primary keys in at least some tables. Adding a primary key to every table makes it easier to extend your database design when it comes time to add new tables or relationships to existing data. Planning ahead is always a good thing.
As a basic principle (hard rule perhaps), the value of a primary key should never change throughout the life of its row. It's wise to assume that any business data in a row is subject to change over its lifetime, so any business data will be a poor candidate for a primary key. This is why something abstract like an auto-incremented integer is often a good idea. However, auto-incremented integers do have their limitations.
If your data will only have a life within your database, auto-incremented integers are fine. But, as has been mentioned in other answers, if you ever want your data to be shared, synchronized, or otherwise have a life outside your database, auto-incremented integers make poor primary keys. A better choice will be a guid (aka uuid "universally unique id").
The question, and many of the answers, miss the important point that all the natural keys for each table reside solely in the logical schema for the database, and all the surrogate keys for each table reside solely in the physical schema for the database. Other answers discuss solely the relative benefits of integer versus GUID surrogate keys, without discussing the reasons why surrogate keys are properly used, and when.
BTW: Let us avoid use of the ill defined and imprecise term primary key. It is an artifact of pre-relational data models that was first co-opted (unwisely) into the relational model, and then co-opted back into the physical domain by various RDBMS vendors. Its use serves only to confuse the semantics.
Note from the relational model that, in order for the database logical schema to be in first normal form, every table must have a user-visible set of fields, known as a natural key, that uniquely identifies each row of the table. In most cases such a natural key is readily identified, but on occasion one must be constructed, whether as a tie breaker field or otherwise. However such a constructed key is always still user visible, and thus always resides in the logical schema of the database.
By contrast any surrogate key on a table resides purely in the physical schema for the database (and thus must always, both for security reasons and for maintenance of database integrity, be entirely invisible to database users). The sole reason for introducing a surrogate key is to address performance issues in the physical maintenance and use of the DB whether those be joins, replication, multiple hardware sources for data, or other.
Since the sole reason for the introduction of a surrogate key is performance, let us presume that we wish it to be performant. If the performance issue at hand is joins, then we necessarily wish to make our surrogate key as narrow as can be (without getting in the way of the hardware, so short integers and bytes are usually out). Join performance relies on minimal index height, so a 4-byte integer is a natural solution. If your performance issue is insertion rate, a 4-byte integer may also be a natural solution (depending on your RDBMS's internals). If your performance issue for a table is replication or multiple data sources, then some other surrogate key technology, be it a GUID or a two-part key (Host ID + integer), may be more suitable. I am not personally a fan of GUIDs, but they are convenient.
To sum up, not all tables will require a surrogate key (of any type); they should only be used when deemed necessary for the performance of the table under consideration. Regardless of which common surrogate key technology you prefer, think carefully about the actual needs of the table before making a choice; changing the surrogate key technology choice for a table will be exhausting work. Document the key performance metric for your table so that your successors will understand the choices made.
If your business requirements mandate a sequential numbering of transactions for audit (or other) purposes, then that field is not a surrogate key; it is a natural key (with extra requirements). From the documentation, an auto-incrementing integer only generates surrogate keys, so find another mechanism to generate it. Obviously some sort of monitor will be necessary, and if you are sourcing your transactions from multiple sites then one site will be special, by virtue of being the designated host site for the monitor.
If your table will never be more than about a hundred rows, then index height is irrelevant; every access will be by a table scan. However, string comparisons on long strings will still be much more expensive than comparison of a 4-byte integer, and more expensive than comparison of a GUID.
A table of code values keyed by a char(4) code field should be as performant as one with a 4-byte integer. Although I have no proof of this I use the assumption frequently and have never had reason to rue it.
Creating a table with auto-incrementing IDs in a SQL query
If you’re starting from scratch, you’ll need to create a table in your database that’s set up to auto-increment its primary keys.
When creating an auto-incremented primary key in Postgres, you’ll need to use SERIAL to generate sequential unique IDs. Default start and increment are set to 1.
When applied to our example inventory data set, table creation looks like:
This first step is pretty straightforward. Just be sure to mark your item number column as the PRIMARY KEY .
Auto-incrementing in MySQL is pretty similar to SQL Server, except you don’t manually include the starting value and integer value. Instead, you use the AUTO_INCREMENT keyword, which has a default start and increment value of 1.
The basic syntax for creating this table in MySQL is:
In our example, here’s the table we’d want to create:
Like Postgres, you need to make sure that you’re using the PRIMARY KEY keyword for the column you want to generate your unique IDs from.
➞ SQL Server
In SQL Server, you’ll use the IDENTITY keyword to set your primary key (or item_number in our use case). By default, the starting value of IDENTITY is 1, and it will increment by 1 with each new entry unless you tell it otherwise.
To start, create a table. The basic syntax:
When applied to our inventory test case, table creation will look like:
Again, one last time: make sure to include PRIMARY KEY next to the IDENTITY keyword, so you’re not just generating a useless list of sequential numbers.
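As a rough side-by-side of the three dialects described above, the sketch below holds one CREATE TABLE statement per database as a plain string and then runs an equivalent table in SQLite to show the generated numbers. Only the item_number column comes from the text; the item_name column, the DDL dictionary, and the demo_item_numbers helper are assumptions made for illustration, and SQLite is only a stand-in for the three servers.

```python
import sqlite3

# One CREATE TABLE statement per dialect. item_name is a hypothetical column;
# only item_number is taken from the inventory example in the text.
DDL = {
    "postgres": (
        "CREATE TABLE inventory ("
        "item_number SERIAL PRIMARY KEY, "
        "item_name VARCHAR(100) NOT NULL);"
    ),
    "mysql": (
        "CREATE TABLE inventory ("
        "item_number INT AUTO_INCREMENT PRIMARY KEY, "
        "item_name VARCHAR(100) NOT NULL);"
    ),
    "sqlserver": (
        "CREATE TABLE inventory ("
        "item_number INT IDENTITY(1,1) PRIMARY KEY, "
        "item_name VARCHAR(100) NOT NULL);"
    ),
}


def demo_item_numbers(names):
    """Insert names into an in-memory SQLite table and return the generated IDs."""
    conn = sqlite3.connect(":memory:")
    # SQLite's closest equivalent: INTEGER PRIMARY KEY AUTOINCREMENT.
    conn.execute(
        "CREATE TABLE inventory ("
        "item_number INTEGER PRIMARY KEY AUTOINCREMENT, "
        "item_name TEXT NOT NULL)"
    )
    for name in names:
        # item_number is omitted, so the database assigns it.
        conn.execute("INSERT INTO inventory (item_name) VALUES (?)", (name,))
    ids = [row[0] for row in conn.execute(
        "SELECT item_number FROM inventory ORDER BY item_number")]
    conn.close()
    return ids
```

In all three server dialects the default start and increment are 1; the explicit IDENTITY(1,1) in the SQL Server string just spells those defaults out.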
Now, once you’ve created the framework of your table, you’ll need to decide if the default starting and increment values make sense for your use case.
You could use a CAML query and sort by the particular field, grab the highest value and add one. Should probably have a distinct check as well.
The query can be something like this:
I don't think your approach or the idea of performing a CAML query would work (at least not consistently), primarily due to your inability to natively manage concurrency. It would be challenging enough to handle it in a single WFE environment, but if you have multiple WFEs you would most assuredly have multiple instances of an event receiver or query executing at the same time, returning the same "last ID" value and then all setting their LogIDNumber property to the same last+1 value.
I suppose you could create an ID list with one column (LogIDNumber), make it a Number type, and make sure Enforce unique values = Yes. You can then add an item to that list in your event receiver and if it chokes then you know another EV instance got there first and you need to requery the last ID until the insert into the ID list doesn't choke. If it doesn't choke then your EV instance owns the ID and can safely update the LogIDNumber property for that list item.
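The retry idea above can be sketched outside SharePoint. The example below is only an illustration of the pattern, assuming a SQLite table with a UNIQUE column in place of the ID list; the names id_list, log_id_number, make_id_list and reserve_next_id are hypothetical, and a real event receiver would use the SharePoint object model instead.

```python
import sqlite3


def make_id_list():
    """Create an in-memory stand-in for the ID list with unique values enforced."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE id_list (log_id_number INTEGER NOT NULL UNIQUE)")
    return conn


def reserve_next_id(conn):
    """Claim last+1 by inserting it; retry if another writer got there first."""
    while True:
        last = conn.execute(
            "SELECT COALESCE(MAX(log_id_number), 0) FROM id_list"
        ).fetchone()[0]
        candidate = last + 1
        try:
            conn.execute(
                "INSERT INTO id_list (log_id_number) VALUES (?)", (candidate,)
            )
            conn.commit()
            # The insert succeeded, so this caller owns the number.
            return candidate
        except sqlite3.IntegrityError:
            # The unique constraint rejected the insert: someone else claimed
            # this number. Roll back and requery the last ID.
            conn.rollback()
```

The uniqueness constraint does the concurrency work here: two callers can read the same last ID, but only one insert of last+1 can succeed.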
Can I ask what requirement(s) are causing you to create your own auto-increment column versus using ID?
I had a similar issue and solved it. Since this comes up high on Google for what I was looking for, it may help others.
I migrated several WordPress databases from AWS RDS MySQL to MySQL running on an EC2 instance, using the database migration service. What I didn't know is that it doesn't copy indexes, keys, auto increment, or really anything other than the basics. Of course the best approach would be to dump the database using mysqldump and import it manually, but one WordPress install had significant changes and I didn't want to redo them. Instead I manually recreated the auto_increment values and indexes.
I've documented how I fixed WordPress auto increment on my website; here's a copy of what worked for me. It's possible I'll make further changes. If so, I'll update the website, but I may not remember to update this question.
- You should check your tables and make sure to set your auto_increment to a value that makes sense for that table.
- If you get the error “alter table causes auto_increment resequencing resulting in duplicate entry 1” (or 0, or something else), this is usually fixed by deleting the entry with the ID 0 or 1 in the table. Note that you should be careful doing this as it could delete an important row.
Why did this happen? Here's what went wrong for me:
If you exported your database using phpMyAdmin and had an error on reimporting it, the code that adds the primary key doesn't run, because it's at the end of the SQL file, not at the table's creation.
Before I figured this out, I updated to the phpMyAdmin 5 beta, and it imported the files with the key even though I still had the error.
Lesson one is: don't let your import crash, even if your tables are there. Mine crashed on a table whose name began with wp_w, so it came after the user tables, and that wrecked my auto increments.
If you look at the bottom of your SQL export, you will find the alter table for adding the Primary Key and the auto increment.
You don't need to specify the auto increment value; it automatically knows what the next increment is, like so:
If you had admin activity since this happened, you have zeros in your key field, which will not allow you to set a primary key, and without that, you can't auto increment. So you need to run a delete script against each table, like so:
Here's a complete set of updates. If your table has these, it will throw an error.
Tim, I had faced the same issue where I needed to restart the identity at the next value. I was using DB2 v9.1.
Unfortunately, there is no way to specify the next value automatically. As per the DB2 documentation, the value should be a 'numeric constant'. Hence I had to do a select max(id), get the value, and put it into the ALTER ... RESTART statement manually.
I don't remember if I tried this, but you could write a stored procedure where max(id) is stored in a variable and the variable is used in the ALTER ... RESTART statement. (I am unable to try it as I don't have access to any DB2 database anymore.) I doubt it'll work though. (If it works, do let me know :))
RESTART or RESTART WITH numeric-constant
Resets the state of the sequence associated with the identity column. If WITH numeric-constant is not specified, the sequence for the identity column is restarted at the value that was specified, either implicitly or explicitly, as the starting value when the identity column was originally created. The column must exist in the specified table (SQLSTATE 42703), and must already be defined with the IDENTITY attribute (SQLSTATE 42837). RESTART does not change the original START WITH value.
The numeric-constant is an exact numeric constant that can be any positive or negative value that could be assigned to this column (SQLSTATE 42815), without non-zero digits existing to the right of the decimal point (SQLSTATE 428FA). The numeric-constant will be used as the next value for the column.
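The manual step described above (run select max(id), then paste the number into the statement) can be scripted on the client side. The sketch below is only an assumption about how that might look: it reads the maximum through a generic DB-API style connection (SQLite in the test, purely as a stand-in for DB2) and returns the ALTER TABLE ... ALTER COLUMN ... RESTART WITH statement as a string rather than executing it. The function name build_restart_statement and its parameters are hypothetical.

```python
import sqlite3  # stand-in driver for the example; a DB2 driver would be used in practice


def build_restart_statement(conn, table, column):
    """Return a DB2 RESTART statement that continues after the current max value.

    table and column are interpolated directly into SQL, so they must come
    from trusted code, never from user input.
    """
    max_id = conn.execute(
        f"SELECT COALESCE(MAX({column}), 0) FROM {table}"
    ).fetchone()[0]
    # RESTART WITH takes a numeric constant, so the value is computed here
    # and written into the statement text as a literal.
    return f"ALTER TABLE {table} ALTER COLUMN {column} RESTART WITH {max_id + 1}"
```

Because the value is baked into the statement text as a literal, this satisfies the numeric-constant requirement quoted above, at the cost of a window between reading the maximum and running the ALTER.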
We frequently need to fill tables with unique identifiers. Naturally, the first example of such identifiers is PRIMARY KEY data. These are usually integer values hidden from the user since their specific values are unimportant.
When adding a row to a table, you need to take this new key value from somewhere. You can set up your own process of generating a new identifier, but MySQL comes to the aid of the user with the AUTO_INCREMENT column setting. It is set as a column attribute and allows you to generate unique integer identifiers. As an example, consider the users table, whose primary key is an id column of type INT:
Inserting a NULL value into the id field leads to the generation of a unique value. Inserting a 0 value has the same effect unless the NO_AUTO_VALUE_ON_ZERO server SQL mode is enabled:
It is possible to omit the id column. The same result is obtained with:
The selection will provide the following result:
Select from users table shown in dbForge Studio
You can get the automatically generated value using the LAST_INSERT_ID() session function. This value can be used to insert a new row into a related table.
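The pattern of reading the generated key and reusing it in a related table can be sketched as follows. This is an illustration only: it uses SQLite, where the cursor's lastrowid attribute plays the role that LAST_INSERT_ID() plays in MySQL, and the profiles table and both helper functions are hypothetical names that do not come from the article.

```python
import sqlite3


def make_db():
    """Create in-memory users and profiles tables (profiles is hypothetical)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE users ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE profiles ("
        "user_id INTEGER NOT NULL REFERENCES users(id), bio TEXT)"
    )
    return conn


def add_user_with_profile(conn, name, bio):
    """Insert a user, then use the generated id for the related profile row."""
    # Passing NULL for id asks the database to generate the value.
    cur = conn.execute("INSERT INTO users (id, name) VALUES (NULL, ?)", (name,))
    user_id = cur.lastrowid  # SQLite analogue of MySQL's LAST_INSERT_ID()
    conn.execute(
        "INSERT INTO profiles (user_id, bio) VALUES (?, ?)", (user_id, bio)
    )
    conn.commit()
    return user_id
```

Like LAST_INSERT_ID(), the value is read on the same connection that did the insert, so other sessions inserting at the same time do not affect it.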
There are aspects to consider when using AUTO_INCREMENT, here are some:
- In the case of rollback of a data insertion transaction, no data will be added to a table. However, the AUTO_INCREMENT counter will still increase, so the next time you insert a row in the table, a gap will appear in the sequence of ID values.
- In the case of multiple data inserts with a single INSERT command, the LAST_INSERT_ID() function will return an automatically generated value for the first row.
- The problem with the AUTO_INCREMENT counter value is described in Bug #199 – Innodb autoincrement stats los on restart.
For example, let’s consider several cases of using AUTO_INCREMENT for table1 :
Note: The next AUTO_INCREMENT value for the table can be parsed from the SHOW CREATE TABLE result or read from the AUTO_INCREMENT field of the INFORMATION_SCHEMA TABLES table.
The rarer case is when the primary key is composite, for example when it consists of two columns. The MyISAM engine has an interesting solution that provides the possibility of generating values for such keys. Let’s consider the example:
It is quite a convenient solution:
Special values auto generation
The possibilities of the AUTO_INCREMENT attribute are limited because it can be used only for generating simple integer values. But what about complex identifier values, for example ones that depend on the date/time or follow a pattern like [A0001, A0002, B0150…]? To be sure, such values should not be used in primary keys, but they might be used for some auxiliary identifiers.
The generation of such unique values can be automated, but it will be necessary to write code for such purposes. We can use the BEFORE INSERT trigger to perform the actions we need.
Let’s consider a simple example. We have the sensors table for sensor registration. Each sensor in the table has its own name, location, and type: 1 – analog, 2 – discrete, 3 – valve. Moreover, each sensor should be marked with a unique label of the form [symbolic representation of the sensor type + a unique 4-digit number], where the symbolic representation is one of [AN, DS, VL].
In our case, it is necessary to form values like these [DS0001, DS0002…] and insert them into the label column.
When the trigger is executed, it is necessary to understand if any sensors of this type exist in the table. It is enough to assign number “1” to the first sensor of a certain type when it is added to the table.
In case such sensors already exist, it is necessary to find the maximum value of the identifier in this group and form a new one by incrementing the value by 1. Naturally, it is necessary to take into account that the label should start with the desired symbol and the number should be 4-digit.
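The labeling rule described above can be sketched as a plain function, independent of the trigger itself. This is not the article's trigger code: it is a minimal Python illustration of the same logic, where PREFIXES and next_label are hypothetical names, and the existing labels are passed in as a list instead of being read from the sensors table.

```python
# Sensor type code -> label prefix, as listed in the text.
PREFIXES = {1: "AN", 2: "DS", 3: "VL"}


def next_label(existing_labels, sensor_type):
    """Return the next label for a sensor type, e.g. DS0001, DS0002, ..."""
    prefix = PREFIXES[sensor_type]
    # Collect the numeric parts of labels that belong to this sensor type.
    numbers = [
        int(label[len(prefix):])
        for label in existing_labels
        if label.startswith(prefix)
    ]
    # No sensors of this type yet -> start at 1; otherwise max + 1.
    next_number = max(numbers, default=0) + 1
    if next_number > 9999:
        raise ValueError(f"no 4-digit numbers left for prefix {prefix}")
    # Zero-pad so the number is always 4 digits wide.
    return f"{prefix}{next_number:04d}"
```

For example, with existing labels DS0001, DS0007 and AN0003, a new discrete sensor (type 2) would get DS0008 and a new valve (type 3) would get VL0001. Inside a database, the max + 1 step would also need protection against two concurrent inserts computing the same number.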
Why are you creating something that SharePoint already does?
You can use the SharePoint default column ID in Workflows (as mentioned in the comments of the blogpost you refer to)
ID gets assigned after a New Item is created, so you cannot use it in Calculated Column Formulas.
But Workflows are executed after the List Item is created, so you can use ID in your workflow.
The only drawback might be that ID cannot easily be reset; it will always increment.
My new workflow is here, and it is working perfectly for me.
The Event Handler approach can also be used for autonumbering if you are into programming, but I haven't tested it!
I'm trying to think of an alternate solution, but it seems you'll need to use a set of If conditions in the solution - that's if you want the preceding 000 prior to the ID value.
The workflow checks whether the ID or the Next Number variable is within a certain range, and then appends CaseOPT and the correct number of zeroes before the ID value.
Edit: The workflow would look something like this:
This is using the format "CaseOPT####" as the unique id.
I've arranged it this way assuming most of your entries will be in the 1000 - 100 range. Rearrange it depending on how many items you expect and how fast your list grows.
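For comparison, where code is available (for example in the event handler approach mentioned earlier), the chain of range checks collapses into a single zero-padding step. The sketch below is in Python purely for illustration, not SharePoint code, and the function name format_case_id and its parameters are hypothetical.

```python
def format_case_id(item_id, prefix="CaseOPT", width=4):
    """Build an identifier like CaseOPT0042 from a numeric list item ID."""
    # Pad with leading zeroes up to `width` digits; IDs that are already
    # longer than `width` are left as they are.
    return f"{prefix}{item_id:0{width}d}"
```

With the default width, ID 7 becomes CaseOPT0007 and ID 123 becomes CaseOPT0123, which matches the "CaseOPT####" format without needing one If branch per range.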
I read that uuid does not bring any security advantages
This is entirely relative to a given context, so it's neither true nor false.
Consider that right now the session id is encrypting the auto-increment id (no uuid is used). If someone manages to learn how the session is encrypted, then they can impersonate all the users: encrypt "1" and set the value as sessionID, encrypt "2" and set the value as sessionID, etc.
Session identifiers work if they're long random pieces of information. They do not encode or encrypt any information; these tokens are used by the server to locate information pertaining to the established session.
In a typical scenario, client A connects to server B for the first time. They have no information or session id at this point. The server generates a new session id and sends it to the client. Potentially authentication occurs and some data pertaining to that particular session is stored on the server. Every subsequent request from the client carries this identifier so that the server can match the data relevant to that particular client during this particular session. Notice that the data is stored on the server; all the client does is issue requests of whatever kind and tack on the session identifier as a way to maintain state in a stateless system.
Simultaneously, other clients are doing the same. The server can maintain multiple states since every client uses its own unique session identifier. If the session identifiers weren't random, or were easily guessable, then an attacker could calculate or guess them and hijack established sessions.
So a randomly generated UUID is no better or worse than a randomly generated session identifier for the same length of random data.
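To make the comparison concrete, here is a minimal sketch of both options using Python's standard library. The function names are hypothetical; secrets.token_urlsafe and uuid.uuid4 are real standard-library calls. Note that a version 4 UUID carries 122 random bits, while the token below is generated from 32 random bytes (256 bits), so the two are only equivalent when the amount of random data is the same.

```python
import secrets
import uuid


def new_session_id(num_bytes=32):
    """Return a URL-safe session token built from num_bytes of random data."""
    # secrets draws from the operating system's cryptographic random source.
    return secrets.token_urlsafe(num_bytes)


def new_uuid_session_id():
    """Return a random (version 4) UUID as a string, for comparison."""
    return str(uuid.uuid4())
```

Either value is just an opaque lookup key: the server stores the session data under it, and nothing about the user or an auto-increment ID is encoded in the token itself.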
Is it good practice to keep 2 related tables (using auto_increment PK) at the same max auto_increment ID when table1 is modified?
This question is about good design practice in programming.
Let's look at this example, where we have 2 interrelated tables:
rID is the auto_increment primary key and textID is the foreign key.
The relationship is that 1 rID will have 1 and only 1 textID, but 1 textID can have several rIDs.
So, when table1 is modified, table2 should be updated accordingly.
OK, here is a fictitious example. You build a very complicated system. When you modify 1 record in table1, you need to keep track of the related record in table2. To keep track, you can do one of the following:
Option 1: When you modify a record in table1, you also modify the related record in table2. This could be quite hard in terms of programming, especially for a very complicated system.
Option 2: Instead of modifying the related record in table2, you delete the old record in table2 and insert a new one. This is easier to program.
For example, suppose you are using option 2. Then, when you modify records 1, 2, 3, ..., 100 in table1, table2 will look like this:
This means the Max of auto_increment IDs in table1 is still the same (100) but the Max of auto_increment IDs in table2 already reached 200.
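The effect of option 2 on the counter can be reproduced in a small experiment. The sketch below is an assumption-laden illustration, not the asker's schema: it uses SQLite with AUTOINCREMENT as a stand-in for MySQL's auto_increment, and the column names id and table1_id as well as the function name are hypothetical.

```python
import sqlite3


def max_id_after_replacements(row_count, rounds):
    """Delete and re-insert every table2 row `rounds` times.

    Returns (max id, number of rows) so the growth of the counter can be
    compared with the unchanged row count.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE table2 ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, table1_id INTEGER NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO table2 (table1_id) VALUES (?)",
        [(r,) for r in range(1, row_count + 1)],
    )
    for _ in range(rounds):
        for r in range(1, row_count + 1):
            # Option 2: drop the old related row and insert a fresh one.
            conn.execute("DELETE FROM table2 WHERE table1_id = ?", (r,))
            conn.execute("INSERT INTO table2 (table1_id) VALUES (?)", (r,))
    max_id, rows = conn.execute(
        "SELECT MAX(id), COUNT(*) FROM table2"
    ).fetchone()
    conn.close()
    return max_id, rows
```

With 100 rows and one round of modifications, the table still holds 100 rows but its highest ID is 200; each further round adds another 100 to the counter, because deleted IDs are not reused.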
What if the user modifies records many times? If they do, could table2 run out of IDs? We could use BigInt, but would that make the app run slower?
Note: If you spend time programming updates to the records in table2 whenever table1 is modified, it will be very hard and thus error-prone. But if you just delete the old records and insert new records into table2, it is much easier to program, and thus your program is simpler and less error-prone.
So, is it good practice to keep 2 related tables (using auto_increment PK) at the same max auto_increment ID when table1 is modified?