<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Christian Riesen &#187; MySQL</title>
	<atom:link href="http://christianriesen.com/category/development/mysql/feed/" rel="self" type="application/rss+xml" />
	<link>http://christianriesen.com</link>
	<description>Life and work in the information and communication age</description>
	<lastBuildDate>Mon, 12 Jul 2010 10:39:53 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0</generator>
		<item>
		<title>Sphinx 0.9.9-release finally out</title>
		<link>http://christianriesen.com/2009/12/sphinx-0-9-9-release-finally-out/</link>
		<comments>http://christianriesen.com/2009/12/sphinx-0-9-9-release-finally-out/#comments</comments>
		<pubDate>Mon, 07 Dec 2009 08:39:07 +0000</pubDate>
		<dc:creator>Christian Riesen</dc:creator>
				<category><![CDATA[Development]]></category>
		<category><![CDATA[MySQL]]></category>
		<category><![CDATA[Sphinx]]></category>

		<guid isPermaLink="false">http://christianriesen.com/?p=108</guid>
		<description><![CDATA[What's that supposed to be, I have been asked a couple  [...]]]></description>
			<content:encoded><![CDATA[<p>&#8220;What&#8217;s that supposed to be?&#8221; I have been asked a couple of times. <a href="http://www.sphinxsearch.com/" onclick="pageTracker._trackPageview('/outgoing/www.sphinxsearch.com/?referer=');">Sphinx</a> is <strong>the</strong> solution to a very ugly and rather common problem.</p>
<p>You want one entry from a database, selected by a unique ID number. In most cases you have an index, probably even a primary key, that will find your record in no time. That is the ideal case though. Now you make lists with many entries, where you already have a much larger set of data to wrangle. Again indexes can be your friend, but they perform less well than a simple select by ID. Worst of all is full text searching. You want all entries that contain a text in multiple fields, maybe even partial matches and so on. Doing this with just a normal query involves things such as LIKE and, under MySQL, probably the % wildcard. If you don&#8217;t have many entries this might even work well enough. Cue large databases and you are completely out of luck.</p>
<p>Sphinx helps you with these scenarios.<br />
<span id="more-108"></span><br />
In Sphinx you run an indexer program that goes into the database and generates its own index for words and even partial words. Depending on the configuration you can teach it to not do partial matches, to match even in the middle of words, or to treat similar words as one and the same word (both on indexing and on search queries). Switching to Sphinx is very simple. You have to install Sphinx on your server (which is very easy), set up the configuration file to reflect your database and search preferences, and then run the indexer periodically (how often depends on your content). Inside your application you change the code path that leads to those LIKE queries so that it asks Sphinx through the supplied API instead. Sphinx does not return the rows from the database, but gives you a list of IDs. With those you run a fast fetch by ID against the database and you have your result set.</p>
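<p>To illustrate that last step, here is a minimal sketch of the follow-up query. The helper name buildFetchQuery, the table documents and its id column are made up for this example, and the Sphinx call itself is left out, so the IDs are simply assumed to arrive as an array.</p>
<pre class="brush: php;">// Hypothetical helper: builds the database query for a list of IDs
// that the search returned. Table and column names are assumptions.
function buildFetchQuery(array $ids)
{
	// Cast to int so nothing but numbers ends up in the SQL string.
	$ids = array_map('intval', $ids);
	if (count($ids) == 0)
	{
		// Nothing matched, so there is nothing to fetch.
		return null;
	}
	$list = implode(', ', $ids);
	// FIELD() makes MySQL return the rows in the order of the ID list.
	return 'SELECT * FROM documents WHERE id IN ('.$list.') ORDER BY FIELD(id, '.$list.')';
}

$query = buildFetchQuery(array(42, 7, 19));</pre>
<p>The ORDER BY FIELD part is optional. I put it in because a plain IN list gives you the rows in whatever order the database likes, while the ID list from a search usually arrives in a meaningful order.</p>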
<p>If you just search for all entries from category X or user Z then you are still better off using the usual ways. But if you need full text searching, then Sphinx is your perfect solution.</p>
<p>And now the <a href="http://www.sphinxsearch.com/news/40.html" onclick="pageTracker._trackPageview('/outgoing/www.sphinxsearch.com/news/40.html?referer=');">latest version is out</a>. A lot of bugs were squashed in it, but most of them were not very important, which goes to show how well it has been working for a long while already. My oldest installation has been running for over a year now without a glitch on the Sphinx end.</p>
<p>Currently Sphinx needs to rebuild the whole index each time you want it updated, unless you use something called delta indexing, which is more of a stopgap measure. A real-time updating index is on the menu for the next version though and will make this extremely powerful tool even better.</p>
 <p>Feel free to Flattr this post at <a href="http://flattr.com/" title="Flattr" target="_blank" onclick="pageTracker._trackPageview('/outgoing/flattr.com/?referer=');">flattr.com</a>, if you like it.</p> <p><a href="http://flattr.com/" title="Flattr" target="_blank" onclick="pageTracker._trackPageview('/outgoing/flattr.com/?referer=');"><img src="http://christianriesen.com/wp-content/plugins/flattrss/button-compact-static-100x17.png" alt="flattr this!"/></a></p>]]></content:encoded>
			<wfw:commentRss>http://christianriesen.com/2009/12/sphinx-0-9-9-release-finally-out/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Creating a MySQL dump in CSV format</title>
		<link>http://christianriesen.com/2009/06/creating-a-mysql-dump-in-csv-format/</link>
		<comments>http://christianriesen.com/2009/06/creating-a-mysql-dump-in-csv-format/#comments</comments>
		<pubDate>Fri, 26 Jun 2009 07:33:47 +0000</pubDate>
		<dc:creator>Christian Riesen</dc:creator>
				<category><![CDATA[MySQL]]></category>
		<category><![CDATA[Backup]]></category>
		<category><![CDATA[mysqldump]]></category>

		<guid isPermaLink="false">http://christianriesen.com/?p=39</guid>
		<description><![CDATA[Mostly, a dump of a db is wanted in SQL. In case of eme [...]]]></description>
			<content:encoded><![CDATA[<p>Mostly, a dump of a database is wanted in SQL: in case of emergency, import the file. But some people do not understand SQL, or their SQL dialect doesn&#8217;t like yours, and everything goes down the drain. So there is CSV, the comma-separated values file. As the name says, it separates the values by commas (and more if needed). Since it&#8217;s so dead simple, plenty of tools and programming languages make it easy to re-import or just search in it. Microsoft Excel and OpenOffice Calc can both handle the format as well, so for a quick look, this will do very nicely.</p>
<p>But there is no simple --csv switch in mysqldump, your weapon of choice for these tasks. So here is the command that will allow you to do what you are after:</p>
<p><code>mysqldump -p -u USER -T DIRECTORY --fields-enclosed-by=\" --fields-terminated-by=, DATABASE</code></p>
<p>So this is the short version, and here is what it all means:</p>
<ul>
<li>-p : Asks for a password, as most users have one. If you don&#8217;t specify this for a user that has a password, the command will fail with an error.</li>
<li>-u USER : Replace USER with your actual username to connect to the database.</li>
<li>-T DIRECTORY : This writes one tab-delimited data file per table (named after the table, ending in .txt) into DIRECTORY, along with a .sql file holding the table definition. Tab-delimited is not what we wanted, but it&#8217;s the base we need.</li>
<li>--fields-enclosed-by=\" : Will add " characters around the fields. This allows CSV implementations to find everything that belongs together. You will need that backslash so the shell passes the quote on, or it won&#8217;t run.</li>
<li>--fields-terminated-by=, : The so much sought after comma. This replaces the tab and puts a comma in its place, which, you guessed it, creates the CSV file.</li>
<li>DATABASE : Well you know, the thing this is all about&#8230;</li>
</ul>
<p>To actually be able to do this, you will need the FILE privilege, because the MySQL server itself writes the data files. Note that FILE is granted globally, not on a single database. Armed with this, you should be able to do your CSV exports easily now.</p>
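<p>As a quick check that such an export can be read back, here is a small PHP sketch. The sample line is invented (a hypothetical table with three columns), and str_getcsv needs PHP 5.3 or later.</p>
<pre class="brush: php;">// Invented sample line in the shape the command above writes:
// fields enclosed by " and terminated by a comma.
$line = '"1","Zurich","8001"';

// Delimiter, enclosure and escape character are passed explicitly.
$fields = str_getcsv($line, ',', '"', '\\');

// $fields now holds the three values as strings.
print_r($fields);</pre>
<p>Real exports can contain backslash escapes and NULL markers, which this sketch does not handle, so treat it as a starting point only.</p>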
]]></content:encoded>
			<wfw:commentRss>http://christianriesen.com/2009/06/creating-a-mysql-dump-in-csv-format/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Large MySQL Inserts with PHP</title>
		<link>http://christianriesen.com/2009/06/large-mysql-inserts-with-php/</link>
		<comments>http://christianriesen.com/2009/06/large-mysql-inserts-with-php/#comments</comments>
		<pubDate>Wed, 10 Jun 2009 07:10:27 +0000</pubDate>
		<dc:creator>Christian Riesen</dc:creator>
				<category><![CDATA[MySQL]]></category>
		<category><![CDATA[PHP]]></category>
		<category><![CDATA[Zend Framework]]></category>

		<guid isPermaLink="false">http://christianriesen.com/?p=7</guid>
		<description><![CDATA[You have to insert 1 Million entries into a database. Y [...]]]></description>
			<content:encoded><![CDATA[<p>You have to insert 1 Million entries into a database. You know there is a structure to it (I&#8217;ll supply one as a sample), so the easiest way is to write a PHP script to do this task for you. Said and done: the script is finished and it does the job, but slow as hell. Not only does it thrash your hard disk like mad, it also times out in the browser, since you don&#8217;t want to do the shell rumba. You have only 20k entries in your database, wasted 30 seconds (while the database is still trying to catch up with the queries sent) and now you realize you need a few more rows anyway.</p>
<p>To make life easy, I&#8217;ll use Zend Framework for this. You could use anything else that sends queries, but since I already used it for this project, there was little sense in doing something else. So, with all the useless stuff stripped out, here is the code in its first, rather naive form.</p>
<p><span id="more-7"></span></p>
<pre class="brush: php;">$db = Zend_Registry::get('db');
$x = 0;
$y = 0;
$z = 0;

while($z &lt; 100)
{
	while($y &lt; 100)
	{
		while($x &lt; 100)
		{
			$query = 'INSERT INTO location (locx, locy, locz) VALUES ';
			$query .= '('.$x.', '.$y.', '.$z.')';
			$db-&gt;query($query);
			$x++;
		}
		$y++;
		$x = 0;
	}
	$z++;
	$y = 0;
}</pre>
<p>This sample creates rows for a location in a 3D space in case you wonder. So these lines represent single points in a cube with the side length of 100 whatever units you want to imagine here. It starts at 0,0,0 and ends at 99,99,99. So this gives us our nice 1 million entries.</p>
<p>The $db variable up there is a Zend_Db_Adapter instance, so I can now run queries directly. Yes I know you could write the query differently, but I left it like that to make life easier for the next steps. So instead of firing off 1 Million queries, how about grouping 100 queries together, into one large one? I left the top part out, so I will only post the while loop. The result looks like this.</p>
<pre class="brush: php;">while($z &lt; 100)
{
	while($y &lt; 100)
	{
		$query = 'INSERT INTO location (locx, locy, locz) VALUES ';
		$first = true;
		while($x &lt; 100)
		{
			if ($first == TRUE)
			{
				$first = false;
			}
			else
			{
				$query .= ', ';
			}
			$query .= '('.$x.', '.$y.', '.$z.')';
			$x++;
		}

		$db-&gt;query($query);
		$y++;
		$x = 0;
	}
	$z++;
	$y = 0;
}</pre>
<p>It&#8217;s faster already, but it still does a lot of thrashing around on the DB. While we now have &#8220;only&#8221; 10&#8217;000 queries instead of 1 Million, it is still far from &#8220;nice&#8221;. I could extend this and create larger queries, so I end up with 100 queries of 10&#8217;000 entries each, but those queries might become too large, and it does not solve much of my problem if I have even more data. Since I&#8217;m using MySQL for this (with a storage engine that supports them, such as InnoDB), I have transactions at my disposal as well. So instead of having everything written with every single query, I start a transaction, run 100 queries, then commit that batch. But wait, isn&#8217;t that the same thing as just running 100 queries? In theory, yes. In practice this changes a few things. Since those queries are small, the system can keep them in RAM and write them out combined: instead of 100 small changes it makes one big one, lightning fast. Here is what this looks like.</p>
<pre class="brush: php;">while($z &lt; 100)
{
	$db-&gt;beginTransaction();
	while($y &lt; 100)
	{
		$query = 'INSERT INTO location (locx, locy, locz) VALUES ';
		$first = true;
		while($x &lt; 100)
		{
			if ($first == TRUE)
			{
				$first = false;
			}
			else
			{
				$query .= ', ';
			}
			$query .= '('.$x.', '.$y.', '.$z.')';
			$x++;
		}

		$db-&gt;query($query);
		$y++;
		$x = 0;
	}
	$db-&gt;commit();
	$z++;
	$y = 0;
}</pre>
<p>Now PHP will write all 1 Million rows in under 30 seconds, even on a relatively small system.</p>
<p>You could of course optimize this, maybe even wrap the whole process in a transaction instead of single steps, but that&#8217;s up to you and your ingenuity. For my needs this did the job so I had no need to go even further with it. Either way you should be able to take it and run with it fast and far for your problem of inserting large amounts of data.</p>
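<p>If you want to try the single-transaction idea, it could look roughly like the sketch below. I have not measured this variant, so take it as a starting point. The function name insertCube is made up, and $db is assumed to be the same Zend_Db_Adapter instance as before.</p>
<pre class="brush: php;">// Sketch: one transaction around the whole import.
// insertCube is a made-up name; $db is assumed to be a Zend_Db_Adapter.
function insertCube($db, $size)
{
	$db-&gt;beginTransaction();
	try
	{
		for ($z = 0; $z &lt; $size; $z++)
		{
			for ($y = 0; $y &lt; $size; $y++)
			{
				// Collect one row of points, then send them as one query.
				$rows = array();
				for ($x = 0; $x &lt; $size; $x++)
				{
					$rows[] = '('.$x.', '.$y.', '.$z.')';
				}
				$db-&gt;query('INSERT INTO location (locx, locy, locz) VALUES '.implode(', ', $rows));
			}
		}
		$db-&gt;commit();
	}
	catch (Exception $e)
	{
		// Undo everything if any query fails, then pass the error on.
		$db-&gt;rollBack();
		throw $e;
	}
}

// Usage: insertCube($db, 100); gives the same 1 Million rows.</pre>
<p>The for loops also take care of resetting the counters, and the rollBack in the catch block means a failed run leaves no half-filled table behind.</p>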
<p>Again, I used Zend Framework, so in your case the beginTransaction and commit functions might not exist or might be named differently.</p>
]]></content:encoded>
			<wfw:commentRss>http://christianriesen.com/2009/06/large-mysql-inserts-with-php/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
