Posted on 23 Comments

Shooting yourself in the foot: renumbering auto-inc IDs

warning: this blog entry is a rant

I see so many questions about this… people want to reset the auto-increment number. It’s just a unique number! Why would you care about its exact value, or whether there are ‘gaps’ in the numbering due to deleted rows? And think of the fact that other tables will be referring to these IDs… want to renumber them too? Eek!

Perhaps, during development, you may feel the aesthetic need to restart the numbering after deleting all data… ok, so you do ALTER TABLE tblname AUTO_INCREMENT=1. That should be about it, really. But then, people want to renumber, to start back at 1 and make sure there are no gaps. I ask you!

I am so utterly fed up with this nonsense that I will now give the world a ‘solution’ to renumber a table:

SET @newid=0;
UPDATE tbl SET id=(@newid:=@newid+1) ORDER BY id;

Please note that with this you shoot yourself in the relational foot.

Posted on 23 Comments

23 thoughts on “Shooting yourself in the foot: renumbering auto-inc IDs

  1. This way below, it always works for me 🙂

    //Delete & renumber id(pk) begin —>

    require_once(“config.php”);
    $connection = mysql_connect(“$server”, “$db_user”, “$db_pass”);
    $db = mysql_select_db(“$database”, $connection);
    mysql_query(“DELETE FROM mytable WHERE id=’$id’ LIMIT 1;”);
    $result = mysql_query(‘SELECT * FROM mytable’);
    $row_number = 1;
    while($row = mysql_fetch_array($result)){
    $var = $row[‘id’];
    $result2 = mysql_query(“UPDATE mytable SET id =’$row_number’ WHERE
    id=’$var'”);
    $row_number++;
    }
    mysql_query(“ALTER TABLE mytable AUTO_INCREMENT = 1;”);

  2. Hi Arjen, I couldn’t agree more. If you feel you need a numerical id, it is either some natural property of the object represented by your record, and then you don’t need automatic incrementing. Or it’s just an artificial way to make your objects unique so you can easily and efficiently identify a a record (usually for foreign key lookup).
    But i do know where the urge fill in the gaps comes from. Lots of people dont really know how big an unsigned int can be, and they’re actually afraid that they may see the day when they reach the end of their resevoir of id’s.

  3. Oh dear. Your “solution” is not multi-user safe. If you were using InnoDB, you would have to do at least
    START TRANSACTION
    SELECT .... FOR UPDATE
    (the loop) UPDATE ...
    COMMIT

    Apart from the fact that none of this is necessary in the first place…. go on, fess up… why have you coded this in your app. Why do you feel the need to renumber??

  4. i understand what you are saying about the number shouldn’t matter.
    but for one, we only have 5 products right now and to be misnumbered would be aesthetically stupid.
    and for two, the table is not linked in any relational way.

    I see no way you can argue with those 2 reasons. did you not even try to think of a valid situation?

  5. So that’s the development case I mentioned… I presume that once you go ‘live’, your product #s won’t ever change, so if you ditch one product and add a new one, that would have to be product 6, right? You don’t want to have any confusion there with customers asking for a specific product #.

  6. Correct. I was resetting it because I was doing some testing on a newly developed site with a small amount of good data. You said something similar but you said you still didn’t think resetting the count was necessary.

  7. Hmm? What I said is easily read above.

  8. Hi,
    I was the one posted the first coding for “Delete and renumber”. I think everything is case by case and it shouldn’t be B/W. The previous site I developed had many tables and the certain relationship was corresponded or coordinated by each table entry’s unique pk (id or code or whatever), and the existence of gaps between row(s) caused by deletions did not matter and uniqueness of these pks were more important to maintain the integrity of the whole system. However, the current site I’m developing is using only one table where each registrant’s info being stored in each row with 20 fields or so. Basically we’re organizing festival and people register and pay both online, or in person. So I developed this table where users as well as each team member can update the new registrations, payments or cancellations etc remotely, and eventually the print out of this table (spread sheet) will be used at the festival entrance to verify each guest’s registration and payment status. Now, after few weeks of entries and deletion on this table by users and several team members, I had no choice but to automatically renumber the pk after each deletion because of cancellation or duplicates found in the table. And nobody liked seeing inconsecutive index numbers on the table. It simply doesn’t look good, I mean script always counts number of existing rows and it doesn’t make any sense while it counts 273 existing entries and index ended up 324; you know what I mean?

  9. Why should a user see an entry ID? That’s none of their concern. It might be embedded in a link URL that they can click on to edit an entry, but still they don’t need to see the #.
    The app users would never see the number 324.

    And eh, a relational database is not a spreadsheet, so please don’t treat it as one….

  10. I still don’t understand why you oppose to renumber the pk (index) of single table being used in this application? It seems to me that it is more logical as well as rational solution. Of course it is a piece of cake to hide pk and add another column to index rows in INT order, but why should we have to do it? Don’t you think it is rather a bug in such DB application to begin with? The DB should automatically renumber the pk through out entire DB according to the new change. I’d call it more smart solution 🙂

  11. The PK id goes up also, so ORDER BY id will still get your rows in order of insertion.
    What’s the problem there? The fact that there are gaps is irrelevant to this.

    But… the fundamental issue is that the exact number DOES NOT MATTER.
    It’s the perception that it matters that’s the problem, and I’m afraid that none of your arguments actually touches on this.

  12. Hi there, I’m the one saying that your number is either a natural property and should not increment or … etc.

    I’m here posting back here to try and meet up to that challenge. So, here it goes…

    As for 1, you say ‘misnumbered’ and ‘aesthetically stupid’. Well, I really can’t see how anything could be ‘misnumbered’. Say we’d have these 5 products: a TV, a Radio, an iPOD, a bunch of tiewraps and a pack of batteries. Is there anything about these products that should make the TV come after the iPOD or whatever? If there is something like that, you should not use an AUTO_INCREMENT column to store that information. And esthetically stupid? Come one, I gather that this is a business application, not an work of art. At best, it’s a work of craftmanship, wich could very well involve all kinds of creativity. But if there’s any beauty in that, than it’s not in those numbers.

    As for ‘not being linked in any relational way’: I wrote that in the “artificial identification” case, such a number provides efficient identification, *USUALLY* for foreign key lookup. Now, suppose there are no tables referencing your product table, you surely need that auto_increment for efficient lookup? I mean, why create an identifier if you’re not using it to, well, identify stuff?

  13. Alright, what I see here are different brains, some stubborn, some flexible or whatever. However, What I want to point out is that Mysql is still in its infant stage of its development, light years away from its completion, right?

    What I’m suggesting is to give as many options as possible for user’s preferences whatever they are. Why should we restrict it? We’re not law makers, right? We’re just trying to improve and develop whatever is necessary to be implemented to satisfy each and every user’s needs. Just like billions of shades in weather we see every day. Have you ever seen the exact sky and landscape at the same time in your life more than once?

    I’d say it looks always different, what’d you say?

  14. Is any RDBMS completed? Development will always continue, except for dead products.
    But this story was not about MySQL features, it’s about best practises in design and maintenance. Of course there too you can have different opinions.

    Regarding features, did you see my post on using a 5.0 stored function and trigger to implement sequences. http://www.livejournal.com/users/arjen_lentz/34627.html
    It’s just an example of the flexibility you have. Surely everybody can do what they need.

  15. Without arguing about database practices or anything like that, all I have to say is…

    if you have a page like “product.php?id=15” and I bookmark it, when I come back, I will always expect that product to exist as what I bookmarked. Don’t change IDs unless you REALLY need to.

  16. There is a difference in having incrementing numbers
    for KEYS or for arbitrary data like a product number.

    The keys are for the db and not the user, so it doesn’t
    matter if you have gaps in the seq of keys.

    Props to the user who provided the quick and dirty renumbering
    query.

  17. What other solutions are you, or would you be using, in other RDBMS that would do what you want?

  18. Most reasons to renumber a pk auto_increment column are either misguided or a result of really bad design.

    One reasonable cause to renumber, however, is in the situation where you have a very large (>1,000,000 or so rows) table that you want to see random rows from very quickly. The fastest way to fetch a random row out of such a table currently is to generate a random id from 1 to max(id) and query for that row. If you have gaps in your data you may have to perform several queries before you get a result.

    This is really the only good reason I’ve ever come across, though.

  19. In that case, phrase a smarter query 😉
    Here’s some ideas:
    http://jan.kneschke.de/projects/mysql/order-by-rand/

  20. The idea given in your link is ok if you don’t need true randomness, but note that it will return the row right after a big gap much more often than other rows.

    (Same author as grandparent)

  21. Hmm not sure it wasn’t already on that page, but one method that satisfies that requirement is having an extra column or table with a sequence. This can be generated/updated in a single query.
    Using a separate table would reduce locking contention.

    Like I said – be creative. Not knowing how to work out the exact construct for your situation this doesn’t mean renumbering is a good idea after all 😉

  22. Yes, that is a very good idea if you need to preserve foriegn key relationships. If you do not have any foreign tables referencing the primary key, though, is there a good reason to do this? (I’m asking because I’m wondering, not rhetorically) It would take up a good deal more space and would be marginally slower as records would be accessed using a non-primary key.

    -The grandparent again

  23. Depends. For MyISAM there’s no difference in speed between primary and secondary indexes.
    For InnoDB there is, with its clustered primary key.
    If there are no dependencies, you can use the UPDATE trick I showed in the original post.

Comments are closed.