MySQL aggregate values on duplicates

This is how you can aggregate values selecting from one table and insert or update to another table using a condition if a certain key/row already exists. This example shows the table of an import from a Apache/LiteSpeed server's access log file `rawlog`. The mission is to aggregate the value based on project, widget and day and by that value either add it to the table of the aggregated values or update it if it already exists. Note that you have to have an unique index key telling what is suppose to be recognized as "already exists". In this case it's a combination of the 3 values of `project`,`widget` and `day`.

If you just want to add the unique index to an existing table you can use:

ALTER TABLE `my_table` ADD UNIQUE (
`project` ,
`widget` ,
`day`
)

Here is the two tables I work with here:

//The raw data table
CREATE TABLE IF NOT EXISTS `old_rawlog` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `ip` varchar(15) NOT NULL,
  `logname` varchar(255) NOT NULL,
  `time` datetime NOT NULL,
  `timezone` varchar(8) NOT NULL,
  `method` varchar(6) NOT NULL,
  `url` text NOT NULL,
  `status` varchar(3) NOT NULL,
  `bytes` int(11) NOT NULL,
  `referer` text NOT NULL,
  `useragent` text NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=MyISAM  DEFAULT CHARSET=utf8 AUTO_INCREMENT=1;

//The aggregated table
CREATE TABLE IF NOT EXISTS `old_aggregated` (
  `hits` int(11) NOT NULL,
  `project` int(11) NOT NULL,
  `widget` int(11) NOT NULL,
  `ref` text NOT NULL,
  `url` text NOT NULL,
  `day` varchar(10) NOT NULL,
  UNIQUE KEY `project` (`project`,`widget`,`day`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

Here is how to SELECT and INSERT or UPDATE in the same MySQL query.

INSERT INTO aggregated (`hits`, `project`, `widget`, `ref`, `url`, `day`)
SELECT COUNT( * ) AS `hits` , SUBSTRING_INDEX( SUBSTRING_INDEX( `url` , 'p=', -1 ) , '&', 1 ) AS `project` , SUBSTRING_INDEX( SUBSTRING_INDEX( `url` , 'w=', -1 ) , '&', 1 ) AS `widget` , LEFT( referer, LOCATE( '/', referer, 8 ) ) AS ref, `url` , DATE( `time` ) AS `day`
FROM `rawlog`
WHERE referer NOT LIKE 'http://somebasesite.com%' AND referer NOT LIKE '-' AND url LIKE '/get_widget_%.png?%'
GROUP BY LEFT( referer, LOCATE( '/', referer, 8 ) ) , `url` , DATE( `time` )
ON DUPLICATE KEY UPDATE aggregated.`hits`= aggregated.`hits`+ VALUES(`hits`);

Simplified version of the above:

INSERT INTO aggregated (`hits`, `project`, `widget`, `ref`, `url`, `day`)
SELECT `hits`, `project`, `widget`, `ref`, `url`, `day`
FROM `rawlog`
WHERE [SOMETHING]
GROUP BY [SOMETHING]
ON DUPLICATE KEY UPDATE aggregated.`hits`= aggregated.`hits`+ VALUES(`hits`);
Knowledge keywords: