Posted By: Anonymous
I started by googling, and found this article which talks about mutex tables.
I have a table with ~14 million records. If I want to add more data in the same format, is there a way to ensure the record I want to insert does not already exist without using a pair of queries (ie, one query to check and one to insert is the result set is empty)?
unique constraint on a field guarantee the
insert will fail if it’s already there?
It seems that with merely a constraint, when I issue the insert via php, the script croaks.
INSERT IGNORE INTO table
INSERT … ON DUPLICATE KEY UPDATE syntax, you can find explanations on dev.mysql.com
Post from bogdan.org.ua according to Google’s webcache:
18th October 2007
To start: as of the latest MySQL, syntax presented in the title is not
possible. But there are several very easy ways to accomplish what is
expected using existing functionality.
There are 3 possible solutions: using INSERT IGNORE, REPLACE, or
INSERT … ON DUPLICATE KEY UPDATE.
Imagine we have a table:
CREATE TABLE `transcripts` ( `ensembl_transcript_id` varchar(20) NOT NULL, `transcript_chrom_start` int(10) unsigned NOT NULL, `transcript_chrom_end` int(10) unsigned NOT NULL, PRIMARY KEY (`ensembl_transcript_id`) ) ENGINE=InnoDB DEFAULT CHARSET=latin1;
Now imagine that we have an automatic pipeline importing transcripts
meta-data from Ensembl, and that due to various reasons the pipeline
might be broken at any step of execution. Thus, we need to ensure two
repeated executions of the pipeline will not destroy our
repeated executions will not die due to ‘duplicate
primary key’ errors.
Method 1: using REPLACE
It’s very simple:
REPLACE INTO `transcripts` SET `ensembl_transcript_id` = 'ENSORGT00000000001', `transcript_chrom_start` = 12345, `transcript_chrom_end` = 12678;
If the record exists, it will be overwritten; if it does not yet
exist, it will be created. However, using this method isn’t efficient
for our case: we do not need to overwrite existing records, it’s fine
just to skip them.
Method 2: using INSERT IGNORE Also very simple:
INSERT IGNORE INTO `transcripts` SET `ensembl_transcript_id` = 'ENSORGT00000000001', `transcript_chrom_start` = 12345, `transcript_chrom_end` = 12678;
Here, if the ‘ensembl_transcript_id’ is already present in the
database, it will be silently skipped (ignored). (To be more precise,
here’s a quote from MySQL reference manual: “If you use the IGNORE
keyword, errors that occur while executing the INSERT statement are
treated as warnings instead. For example, without IGNORE, a row that
duplicates an existing UNIQUE index or PRIMARY KEY value in the table
causes a duplicate-key error and the statement is aborted.”.) If the
record doesn’t yet exist, it will be created.
This second method has several potential weaknesses, including
non-abortion of the query in case any other problem occurs (see the
manual). Thus it should be used if previously tested without the
Method 3: using INSERT … ON DUPLICATE KEY UPDATE:
Third option is to use
INSERT … ON DUPLICATE KEY UPDATE
syntax, and in the UPDATE part just do nothing do some meaningless
(empty) operation, like calculating 0+0 (Geoffray suggests doing the
id=id assignment for the MySQL optimization engine to ignore this
operation). Advantage of this method is that it only ignores duplicate
key events, and still aborts on other errors.
As a final notice: this post was inspired by Xaprb. I’d also advise to
consult his other post on writing flexible SQL queries.