<?xml version='1.0' encoding='UTF-8'?><?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?><feed xmlns='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/' xmlns:georss='http://www.georss.org/georss' xmlns:thr='http://purl.org/syndication/thread/1.0'><id>tag:blogger.com,1999:blog-6138855974683141265</id><updated>2010-06-19T04:53:19.753-06:00</updated><title type='text'>Angry UNIXoid’s Humble Abode</title><subtitle type='html'></subtitle><link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://blog.mitechki.net/feeds/posts/default'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6138855974683141265/posts/default?orderby=updated'/><link rel='alternate' type='text/html' href='http://blog.mitechki.net/'/><link rel='hub' href='http://pubsubhubbub.appspot.com/'/><author><name>Heavy Battle Wombat</name><uri>http://www.blogger.com/profile/06845772522703289167</uri><email>noreply@blogger.com</email></author><generator version='7.00' uri='http://www.blogger.com'>Blogger</generator><openSearch:totalResults>22</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>25</openSearch:itemsPerPage><entry><id>tag:blogger.com,1999:blog-6138855974683141265.post-768997680655132244</id><published>2010-05-05T21:50:00.001-06:00</published><updated>2010-05-12T15:46:57.175-06:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='clojure'/><title type='text'>How to scrape websites in clojure for fun and profit</title><content type='html'>Let's say you are hunting for a good deal on a hard drive and you want to monitor prices on &lt;a href="http://newegg.com/"&gt;newegg.com&lt;/a&gt;. You want an internal hard drive of (let's say) over 1TB in size. And you are too lazy to open a browser, so you want to do this in your &lt;a href="http://clojure.org/"&gt;favorite functional programming language&lt;/a&gt;. 
Well, maybe this is not very plausible, but this is a short primer on parsing web pages using Clojure, so there. You could use a Java-based HTML parser, such as &lt;a href="http://htmlcleaner.sourceforge.net/index.php"&gt;HtmlCleaner&lt;/a&gt;. There was recently an &lt;a href="http://www.sids.in/blog/2010/05/06/html-parsing-in-clojure-using-htmlcleaner/"&gt;excellent article&lt;/a&gt; about it. But let's say that you would prefer to do it in a more functional style. Well, this is where &lt;a href="http://github.com/cgrand/enlive"&gt;Enlive&lt;/a&gt; comes in. I will assume that you have Emacs, SLIME, swank-clojure and Leiningen all sorted out, so let's start with the meat of the process. The project.clj should be something like this:&lt;br /&gt;&lt;pre&gt;(defproject newegg &lt;span class="string"&gt;"1.0.0-SNAPSHOT"&lt;/span&gt;&lt;br /&gt;  &lt;span class="builtin"&gt;:description&lt;/span&gt; &lt;span class="string"&gt;"newegg scraping"&lt;/span&gt;&lt;br /&gt;  &lt;span class="builtin"&gt;:dev-dependencies&lt;/span&gt; [[leiningen/lein-swank &lt;span class="string"&gt;"1.2.0-SNAPSHOT"&lt;/span&gt;]]&lt;br /&gt;  &lt;span class="builtin"&gt;:dependencies&lt;/span&gt; [&lt;br /&gt;                 [org.clojure/clojure &lt;span class="string"&gt;"1.1.0"&lt;/span&gt;]&lt;br /&gt;                 [org.clojure/clojure-contrib &lt;span class="string"&gt;"1.1.0"&lt;/span&gt;]&lt;br /&gt;                 [enlive &lt;span class="string"&gt;"1.0.0-SNAPSHOT"&lt;/span&gt;]])&lt;/pre&gt;Now we can start coding. We are going to define selectors for the HTML elements we are interested in and then return a list of the data they contain. 
In this instance, I am aiming to get the price, a short description and the rating.&lt;br /&gt;    &lt;pre&gt;(&lt;span class="keyword"&gt;ns&lt;/span&gt; newegg&lt;br /&gt;  (&lt;span class="builtin"&gt;:require&lt;/span&gt; [clojure.contrib.str-utils2 &lt;span class="builtin"&gt;:as&lt;/span&gt; str2])&lt;br /&gt;  (&lt;span class="builtin"&gt;:require&lt;/span&gt; [clojure.contrib.json.read &lt;span class="builtin"&gt;:as&lt;/span&gt; json])&lt;br /&gt;  (&lt;span class="builtin"&gt;:require&lt;/span&gt; [net.cgrand.enlive-html &lt;span class="builtin"&gt;:as&lt;/span&gt; html]))&lt;br /&gt;&lt;br /&gt;(&lt;span class="keyword"&gt;def&lt;/span&gt; &lt;span class="function-name"&gt;*base-url*&lt;/span&gt; (&lt;span class="builtin"&gt;str&lt;/span&gt; &lt;br /&gt;                 &lt;span class="string"&gt;"http://www.newegg.com/"&lt;/span&gt;&lt;br /&gt;                 &lt;span class="string"&gt;"Product/ProductList.aspx"&lt;/span&gt;&lt;br /&gt;                 &lt;span class="string"&gt;"?Submit=ENE&amp;amp;"&lt;/span&gt;&lt;br /&gt;                 &lt;span class="string"&gt;"N=2010150014%20103530090%201035915133&amp;amp;"&lt;/span&gt;&lt;br /&gt;                 &lt;span class="string"&gt;"bop=And&amp;amp;"&lt;/span&gt;&lt;br /&gt;                 &lt;span class="string"&gt;"ShowDeactivatedMark=False&amp;amp;"&lt;/span&gt;&lt;br /&gt;                 &lt;span class="string"&gt;"Order=RATING&amp;amp;"&lt;/span&gt;&lt;br /&gt;                 &lt;span class="string"&gt;"Pagesize=100"&lt;/span&gt;))&lt;br /&gt;&lt;br /&gt;&lt;span class="comment-delimiter"&gt;;; &lt;/span&gt;&lt;span class="comment"&gt;pick all div elements of class itemCell&lt;br /&gt;&lt;/span&gt;(&lt;span class="keyword"&gt;def&lt;/span&gt; &lt;span class="function-name"&gt;*item-list-selector*&lt;/span&gt; [&lt;span class="builtin"&gt;:div.itemCell&lt;/span&gt;])&lt;br /&gt;&lt;span class="comment-delimiter"&gt;;; &lt;/span&gt;&lt;span class="comment"&gt;pick spans of class itemDescription&lt;br /&gt;&lt;/span&gt;(&lt;span class="keyword"&gt;def&lt;/span&gt; &lt;span class="function-name"&gt;*item-description-selector*&lt;/span&gt; [&lt;span class="builtin"&gt;:span.itemDescription&lt;/span&gt;])&lt;br /&gt;&lt;span class="comment-delimiter"&gt;;; &lt;/span&gt;&lt;span class="comment"&gt;pick hidden inputs&lt;br /&gt;&lt;/span&gt;(&lt;span class="keyword"&gt;def&lt;/span&gt; &lt;span class="function-name"&gt;*item-price-selector*&lt;/span&gt; [[&lt;span class="builtin"&gt;:input&lt;/span&gt; (html/attr= &lt;span class="builtin"&gt;:type&lt;/span&gt; &lt;span class="string"&gt;"hidden"&lt;/span&gt;)]])&lt;br /&gt;&lt;span class="comment-delimiter"&gt;;; &lt;/span&gt;&lt;span class="comment"&gt;pick anchor of class itemRating&lt;br /&gt;&lt;/span&gt;(&lt;span class="keyword"&gt;def&lt;/span&gt; &lt;span class="function-name"&gt;*item-rating-selector*&lt;/span&gt; [&lt;span class="builtin"&gt;:a.itemRating&lt;/span&gt;])&lt;br /&gt;&lt;br /&gt;(&lt;span class="keyword"&gt;defn&lt;/span&gt; &lt;span class="function-name"&gt;html-data&lt;/span&gt; []&lt;br /&gt;  (html/html-resource (java.net.URL. 
*base-url*)))&lt;br /&gt;&lt;br /&gt;(&lt;span class="keyword"&gt;defn&lt;/span&gt; &lt;span class="function-name"&gt;item-list&lt;/span&gt; [] &lt;br /&gt;  (html/select (html-data) *item-list-selector*))&lt;br /&gt;&lt;br /&gt;(&lt;span class="keyword"&gt;defn&lt;/span&gt; &lt;span class="function-name"&gt;item-properties&lt;/span&gt; [item]&lt;br /&gt;  (list      &lt;br /&gt;   (&lt;span class="builtin"&gt;first&lt;/span&gt; &lt;br /&gt;    (&lt;span class="builtin"&gt;:content&lt;/span&gt; &lt;br /&gt;     (&lt;span class="builtin"&gt;first&lt;/span&gt; &lt;br /&gt;      (html/select item *item-description-selector*))))&lt;br /&gt;   (&lt;span class="builtin"&gt;:value&lt;/span&gt; (&lt;span class="builtin"&gt;:attrs&lt;/span&gt; (&lt;span class="builtin"&gt;first&lt;/span&gt;&lt;br /&gt;                    (html/select item *item-price-selector*))))&lt;br /&gt;   (&lt;span class="keyword"&gt;if&lt;/span&gt; (&lt;span class="builtin"&gt;empty?&lt;/span&gt; (html/select item *item-rating-selector*))&lt;br /&gt;     &lt;span class="string"&gt;""&lt;/span&gt;&lt;br /&gt;     (&lt;span class="builtin"&gt;re-find&lt;/span&gt; #&lt;span class="string"&gt;"\d+$"&lt;/span&gt; &lt;br /&gt;              (&lt;span class="builtin"&gt;:title&lt;/span&gt; &lt;br /&gt;               (&lt;span class="builtin"&gt;:attrs&lt;/span&gt; &lt;br /&gt;                (&lt;span class="builtin"&gt;first&lt;/span&gt;&lt;br /&gt;                 (html/select item *item-rating-selector*))))))))&lt;br /&gt;&lt;br /&gt;(&lt;span class="keyword"&gt;defn&lt;/span&gt; &lt;span class="function-name"&gt;scrape-and-print&lt;/span&gt; []&lt;br /&gt;  (&lt;span class="keyword"&gt;doseq&lt;/span&gt; [item (item-list)] (println (str2/join &lt;span class="string"&gt;" "&lt;/span&gt; (item-properties item)))))&lt;br /&gt;&lt;/pre&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' 
src='https://blogger.googleusercontent.com/tracker/6138855974683141265-768997680655132244?l=blog.mitechki.net' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://blog.mitechki.net/feeds/768997680655132244/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://blog.mitechki.net/2010/05/how-to-scrape-websites-in-clojure-for.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6138855974683141265/posts/default/768997680655132244'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6138855974683141265/posts/default/768997680655132244'/><link rel='alternate' type='text/html' href='http://blog.mitechki.net/2010/05/how-to-scrape-websites-in-clojure-for.html' title='How to scrape websites in clojure for fun and profit'/><author><name>Heavy Battle Wombat</name><uri>http://www.blogger.com/profile/06845772522703289167</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='00436648329331425664'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6138855974683141265.post-6319107576851753984</id><published>2010-05-06T11:19:00.004-06:00</published><updated>2010-05-12T15:43:16.729-06:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='clojure'/><title type='text'>Namespace trickery in Clojure</title><content type='html'>As you might have guessed from my last post, I have been playing around with web site scraping lately. This posed an interesting problem unrelated to HTML parsing. Each site needs its own function (or with refactoring a bunch of functions) to scrape the data. And generally you want to run these functions on a schedule, so you want a function to run all scrapers. 
And personally, I like magic, so I wanted to just add scraper functions and have the aggregator function call them without me doing anything else. At first, I kept all my scrapers in a single scrapers.clj file, so I came up with the following solution.&lt;br /&gt;&lt;pre&gt;&lt;span class="comment-delimiter"&gt;;; &lt;/span&gt;&lt;span class="comment"&gt;add scraper metadata&lt;br /&gt;&lt;/span&gt;(&lt;span class="keyword"&gt;defmacro&lt;/span&gt; &lt;span class="function-name"&gt;defscraper&lt;/span&gt; &lt;br /&gt;  [name &amp;amp; decls]&lt;br /&gt;  (list* 'defn- (&lt;span class="builtin"&gt;with-meta&lt;/span&gt; name &lt;br /&gt;                  (&lt;span class="builtin"&gt;assoc&lt;/span&gt; (meta name) &lt;span class="builtin"&gt;:scraper&lt;/span&gt; true)) name decls))&lt;br /&gt;&lt;br /&gt;&lt;span class="comment-delimiter"&gt;;; &lt;/span&gt;&lt;span class="comment"&gt;compile a list of defined scrapers&lt;br /&gt;&lt;/span&gt;(&lt;span class="keyword"&gt;defn-&lt;/span&gt; &lt;span class="function-name"&gt;*collect-scrapers*&lt;/span&gt; []&lt;br /&gt;  (&lt;span class="builtin"&gt;filter&lt;/span&gt; &lt;br /&gt;   (&lt;span class="keyword"&gt;fn&lt;/span&gt; [func] (get (meta (val func)) &lt;span class="builtin"&gt;:scraper&lt;/span&gt; false))&lt;br /&gt;   (ns-interns 'com.wombat.web.scrapers)))&lt;br /&gt;&lt;br /&gt;&lt;span class="comment-delimiter"&gt;;; &lt;/span&gt;&lt;span class="comment"&gt;run all defined scrapers&lt;br /&gt;&lt;/span&gt;(&lt;span class="keyword"&gt;defn&lt;/span&gt; &lt;span class="function-name"&gt;*run-all-scrapers*&lt;/span&gt; []&lt;br /&gt;  (&lt;span class="keyword"&gt;let&lt;/span&gt; [scrapers (*collect-scrapers*)&lt;br /&gt;        threads (&lt;span class="keyword"&gt;doall&lt;/span&gt; &lt;br /&gt;                 (&lt;span class="keyword"&gt;for&lt;/span&gt; [[name scraper] scrapers] &lt;br /&gt;                   (future (store-site (scraper)))))]&lt;br /&gt;    (&lt;span 
class="keyword"&gt;doseq&lt;/span&gt; [t threads] (deref t))))&lt;/pre&gt;Then I could just use defscraper instead of defn and voila, any function defined using defscraper would be run in parallel by (*run-all-scrapers*).&lt;br /&gt;&lt;br /&gt;But after a while, several other issues came up. The scrapers file was getting long. I needed to define other functions to work with scrapers, like individual functions that would store data from a scraper into a database or return information about the web site, etc. So, I split the scrapers file and put each scraper into its own file and its own namespace. At first, I wanted to just refer all the scraper namespaces into the main scrapers namespace, but then I had an idea. What if, instead of polluting the main namespace with all the scraper functions, I could keep them in their individual namespaces and find them by a standard name? So, I deleted the defscraper macro, changed all scraper function definitions to defn and called them all scraper. Then I changed *collect-scrapers* and *run-all-scrapers* to look like this.&lt;br /&gt;    &lt;pre&gt;&lt;span class="comment-delimiter"&gt;;; &lt;/span&gt;&lt;span class="comment"&gt;compile a list of defined scrapers&lt;br /&gt;&lt;/span&gt;(&lt;span class="keyword"&gt;defn-&lt;/span&gt; &lt;span class="function-name"&gt;*collect-scrapers*&lt;/span&gt; []&lt;br /&gt;  (&lt;span class="builtin"&gt;map&lt;/span&gt; &lt;br /&gt;   #(get (ns-publics %1) 'scraper) &lt;br /&gt;   (&lt;span class="builtin"&gt;filter&lt;/span&gt; #(contains? 
(ns-publics %1) 'scraper) (all-ns))))&lt;br /&gt;&lt;br /&gt;&lt;span class="comment-delimiter"&gt;;; &lt;/span&gt;&lt;span class="comment"&gt;run all defined scrapers&lt;br /&gt;&lt;/span&gt;(&lt;span class="keyword"&gt;defn&lt;/span&gt; &lt;span class="function-name"&gt;*run-all-scrapers*&lt;/span&gt; []&lt;br /&gt;  (&lt;span class="keyword"&gt;let&lt;/span&gt; [scrapers (*collect-scrapers*)&lt;br /&gt;        threads (&lt;span class="keyword"&gt;doall&lt;/span&gt; (&lt;span class="keyword"&gt;for&lt;/span&gt; [scraper scrapers] &lt;br /&gt;                         (future (store-site (scraper)))))]&lt;br /&gt;    (&lt;span class="keyword"&gt;doseq&lt;/span&gt; [t threads] (deref t))))&lt;br /&gt;&lt;/pre&gt;And that is that.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6138855974683141265-6319107576851753984?l=blog.mitechki.net' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://blog.mitechki.net/feeds/6319107576851753984/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://blog.mitechki.net/2010/05/namespace-trickery-in-clojure.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6138855974683141265/posts/default/6319107576851753984'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6138855974683141265/posts/default/6319107576851753984'/><link rel='alternate' type='text/html' href='http://blog.mitechki.net/2010/05/namespace-trickery-in-clojure.html' title='Namespace trickery in Clojure'/><author><name>Heavy Battle Wombat</name><uri>http://www.blogger.com/profile/06845772522703289167</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' 
value='00436648329331425664'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6138855974683141265.post-7331818549912594591</id><published>2010-05-11T18:32:00.016-06:00</published><updated>2010-05-12T15:34:19.819-06:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='clojure'/><title type='text'>Short tutorial on extending Leiningen</title><content type='html'>We all use and love leiningen, the ultimate Clojure build tool. Sometimes, though, we want leiningen to do something it doesn't know how to do. Here is a short and simple tutorial on making your own leiningen tasks. In your project.clj, after the (defproject ...) form, add the following:&lt;br /&gt;&lt;pre class="code"&gt;(&lt;span class="keyword"&gt;ns&lt;/span&gt; leiningen.hello)&lt;br /&gt;(&lt;span class="keyword"&gt;defn&lt;/span&gt; &lt;span class="function-name"&gt;hello&lt;/span&gt;&lt;br /&gt;[project]&lt;br /&gt;(println &lt;span class="string"&gt;"Hello Leiningen!"&lt;/span&gt;)&lt;br /&gt;(println &lt;span class="string"&gt;"ants"&lt;/span&gt;))&lt;/pre&gt;&lt;br /&gt;Now, when you run lein hello, you will see it print out a message to Leiningen from the ants.&lt;br /&gt;&lt;br /&gt;So, to make a new leiningen task, all you need to do is define a new namespace under leiningen and define a function by the same name. The project variable passed to the function is a hash map containing all project information. For example, here is a slight modification of the hello task.&lt;br /&gt;&lt;pre&gt;(&lt;span class="keyword"&gt;ns&lt;/span&gt; leiningen.hello)&lt;br /&gt;(&lt;span class="keyword"&gt;defn&lt;/span&gt; &lt;span class="function-name"&gt;hello&lt;/span&gt;&lt;br /&gt;[project]&lt;br /&gt;(println (&lt;span class="builtin"&gt;format&lt;/span&gt; &lt;span class="string"&gt;"Hello from %s project!"&lt;/span&gt; (&lt;span class="builtin"&gt;:name&lt;/span&gt; project))))&lt;/pre&gt;&lt;br /&gt;This should print you a greeting from your project. 
To see what other information is in the project variable, I came up with the following task.&lt;br /&gt;&lt;pre&gt;(&lt;span class="keyword"&gt;ns&lt;/span&gt; leiningen.info&lt;br /&gt;&lt;span class="string"&gt;"Print all project variables and their values"&lt;/span&gt;&lt;br /&gt;(&lt;span class="builtin"&gt;:use&lt;/span&gt; [clojure.contrib.pprint &lt;span class="builtin"&gt;:only&lt;/span&gt; [pprint pprint-indent]]))&lt;br /&gt;(&lt;span class="keyword"&gt;defn&lt;/span&gt; &lt;span class="function-name"&gt;info&lt;/span&gt;&lt;br /&gt;[project]&lt;br /&gt;(&lt;span class="keyword"&gt;doseq&lt;/span&gt; [key (&lt;span class="builtin"&gt;keys&lt;/span&gt; project)] &lt;br /&gt;(println (&lt;span class="builtin"&gt;format&lt;/span&gt; &lt;span class="string"&gt;"%s:"&lt;/span&gt; (name key)))&lt;br /&gt;(pprint (get project key))))&lt;/pre&gt;&lt;br /&gt;This is almost all there is to it, but there are a couple of additional notes.&lt;br /&gt;&lt;ul&gt;&lt;li&gt;All extra arguments after the task name will also be passed to the task function, so if you want to handle arguments, define your task handler like this: (defn sometask [project &amp;amp; args] ... )&lt;/li&gt;&lt;li&gt;Your new task will not appear in the list of available tasks, and running the help task on it will generate an error. This is because the Leiningen help task uses the classpath to look for tasks and will not find anything that is inside the project.clj file. If this is important to you, you can put your task into a separate project, generate a jar file and copy it into the lib directory of your main project.&lt;/li&gt;&lt;li&gt;If you do go for the task jar solution, the help task looks for the doc string in your namespace definition for the help message to display. 
So, your namespace definition should look like this&lt;br /&gt;&lt;pre&gt;(&lt;span class="keyword"&gt;ns&lt;/span&gt; leiningen.silly&lt;br /&gt;&lt;span class="string"&gt;"This task does something silly"&lt;/span&gt;)&lt;br /&gt;(&lt;span class="keyword"&gt;defn&lt;/span&gt; &lt;span class="function-name"&gt;silly&lt;/span&gt; &lt;br /&gt;[project] &lt;br /&gt;(println &lt;span class="string"&gt;"Your project SUCKS!"&lt;/span&gt;))&lt;/pre&gt;&lt;/li&gt;&lt;/ul&gt;This is it kids.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6138855974683141265-7331818549912594591?l=blog.mitechki.net' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://blog.mitechki.net/feeds/7331818549912594591/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://blog.mitechki.net/2010/05/short-tutorial-on-extending-leiningen.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6138855974683141265/posts/default/7331818549912594591'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6138855974683141265/posts/default/7331818549912594591'/><link rel='alternate' type='text/html' href='http://blog.mitechki.net/2010/05/short-tutorial-on-extending-leiningen.html' title='Short tutorial on extending Leiningen'/><author><name>Heavy Battle Wombat</name><uri>http://www.blogger.com/profile/06845772522703289167</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='00436648329331425664'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6138855974683141265.post-9123716827119507159</id><published>2010-05-02T00:30:00.004-06:00</published><updated>2010-05-12T04:32:02.051-06:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='rant'/><category 
scheme='http://www.blogger.com/atom/ns#' term='vim'/><category scheme='http://www.blogger.com/atom/ns#' term='emacs'/><title type='text'>Why switch from VIM to emacs?</title><content type='html'>&lt;h2&gt;Preface&lt;/h2&gt;OK, this topic has been discussed many times, sometimes &lt;a href="http://upsilon.cc/~zack/blog/posts/2008/10/from_Vim_to_Emacs_-_part_1/"&gt;by much more competent people than myself&lt;/a&gt;. So, I will quickly reiterate the main reasons one might consider switching and proceed to other issues.&lt;br /&gt;&lt;br /&gt;&lt;h2&gt;Why not Vim?&lt;/h2&gt;&lt;h4&gt;Vim is just fine... for some things.&lt;/h4&gt;I have been using Vim for years (and was quite adamantly against Emacs). I work as a system administrator and for me, vi is one of the main tools of the trade, since it is on every system. On Linux systems you will mostly get Vim installed as the default vi, so learning and using Vim was natural. Most of my editing tasks involved changing configuration files and writing relatively short scripts. Almost no debugging was involved, and where there was debugging, it was mostly just a run/observe errors/fix script/run again cycle. For this type of use, Vim is perfect. It loads fast, so you can actually quit it every time you are done with editing, and most testing/debugging can be accomplished by switching to a terminal window (or even better, to a terminal window in a screen session). It is only when you start spending significant amounts of time writing code that Vim's deficiencies start coming to light. What deficiencies? There are three main ones.&lt;br /&gt;&lt;h4&gt;Vim is bad at communicating with external processes&lt;/h4&gt;While it is, of course, possible to run shell commands from Vim and even pipe data into the Vim buffer, this is not enough. You need to be able to properly interact with a process such as a debugger. You need to send commands to it and capture their output, not run them and forget. 
Emacs is excellent at this, but with Vim, either there is built-in support for a particular program (like gdb), or you are out of luck, or you will need a lot of hacking (like vimclojure).&lt;br /&gt;&lt;h4&gt;Vim is not very good at editing multiple documents&lt;/h4&gt;Well, this is not exactly true: Vim supports opening multiple files and recently added tab support, but it is not as convenient and does not feel as natural as in Emacs. Multiple file support in Vim just feels awkward.&lt;br /&gt;&lt;h4&gt;Extending Vim is a pain&lt;/h4&gt;Vim's internal scripting language is strange, and scripting with other languages compiled into Vim, such as Ruby or Python, is limited and not very portable. While many consider LISP to be strange, I find it to be not nearly as strange as vimscript. &lt;br /&gt;&lt;h2&gt;Why Emacs?&lt;/h2&gt;&lt;h4&gt;Emacs is very good at communicating with external processes&lt;/h4&gt;So, you get a lot of the benefits of the underlying OS right there in your editor. You also get much better integration with compilers, interpreters, REPL environments, etc. You can use &lt;a href="http://en.wikipedia.org/wiki/Interactive_Ruby_Shell"&gt;IRB&lt;/a&gt; and &lt;a href="http://ipython.scipy.org/moin/"&gt;IPython&lt;/a&gt; or many other interactive dynamic language environments right out of the editor and get symbol completion and many other niceties. You can use programs like ssh, telnet or rsync to edit files on remote systems. There are too many uses to enumerate here, but I think you get the point.&lt;br /&gt;&lt;h4&gt;Emacs is easy to configure&lt;/h4&gt;While originally you would have to configure Emacs by writing things in Emacs LISP, it is no longer required. 
Recent versions of Emacs sport a very powerful customization interface that allows you to change a lot of different aspects of the editor by pointing and clicking on things.&lt;br /&gt;&lt;h4&gt;Emacs is old and the community is obsessive&lt;/h4&gt;While Vim has been around since 1991 and only got proper scripting support in 1998 (some would say in 2001), Emacs has been around since the 70's. And during these 30-something years, many talented people have attempted to teach Emacs to do just about anything you could possibly imagine. So, if you want Emacs to do something, chances are someone somewhere wrote a cute little bit of Lisp that does exactly what you want.&lt;br /&gt;&lt;h4&gt;LISP is good for you :)&lt;/h4&gt;And if Emacs is not doing something you want, you can change just about anything. And you should. Because anyone who calls himself a programmer should know at least a little bit of some Lisp-like language, and it might as well be Emacs LISP. It will alter your perception of reality, open your mind and chakras, walk your dog, neuter your cat and return your library books on time in under 10 lines of code.&lt;br /&gt;&lt;h2&gt;But...&lt;/h2&gt;&lt;h4&gt;But I am so used to Vim&lt;/h4&gt;Emacs has a mode called Viper that makes Emacs behave in a Vimish way. It has different levels, in order to gradually phase out your Vim habits. If you tend to enter a cold pool by first dipping your little toe, you might want to start with Viper. I am more of a dive, head-first, while screaming obscenities person, so I do not use it.&lt;br /&gt;&lt;h4&gt;But Emacs takes forever to load&lt;/h4&gt;Well, first, it is not true. A simple Emacs setup loads as fast as a simple Vim setup, and a complicated Vim setup loads as slowly as a complicated Emacs setup. On top of that, Emacs has an autoload ability that allows you to load only the minimally required stuff at startup and load the rest when it is actually needed. And Emacs LISP can be byte-compiled to speed up loading times. 
And in any case, Emacs is more of a programmer's editor than a sysadmin's (I am having my doubts, but so I heard), so it is not really intended to be closed after every edit. It is intended to be loaded once at the start of the day &lt;strike&gt;and never stopped again&lt;/strike&gt; and possibly stopped when the work is over, but not necessarily.&lt;br /&gt;&lt;h4&gt;But all those parentheses are awful!!!&lt;/h4&gt;No, they are not. They are beautiful. And if you let Emacs do the indentation and turn on highlight-parentheses-mode, they are even more awesome. And anyway, I think a person who is used to typing things like :g/^"foo.*?"/d and :s/^foo\(.*\)bar$/bar\1foo/ shouldn't complain about syntax.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6138855974683141265-9123716827119507159?l=blog.mitechki.net' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://blog.mitechki.net/feeds/9123716827119507159/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://blog.mitechki.net/2010/05/why-switch-from-vim-to-emacs.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6138855974683141265/posts/default/9123716827119507159'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6138855974683141265/posts/default/9123716827119507159'/><link rel='alternate' type='text/html' href='http://blog.mitechki.net/2010/05/why-switch-from-vim-to-emacs.html' title='Why switch from VIM to emacs?'/><author><name>Heavy Battle Wombat</name><uri>http://www.blogger.com/profile/06845772522703289167</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' 
value='00436648329331425664'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6138855974683141265.post-553548906282113625</id><published>2008-07-31T10:31:00.002-06:00</published><updated>2010-05-12T03:40:20.158-06:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='python'/><category scheme='http://www.blogger.com/atom/ns#' term='unix'/><title type='text'>crontab to english translator</title><content type='html'>A couple of years ago I wrote this script, which takes crontab entries from standard input, parses them and prints an English translation. It is definitely not perfect and will bail on a lot of valid crontab entries, but for what it is worth, here it is.&lt;br/&gt;&lt;pre class="brush: python"&gt;&lt;br/&gt;#!/usr/bin/python&lt;br/&gt;&lt;br/&gt;import re&lt;br/&gt;import os&lt;br/&gt;import sys&lt;br/&gt;import string&lt;br/&gt;&lt;br/&gt;class CronJob:&lt;br/&gt;        """A class describing a scheduled job."""&lt;br/&gt;        def __init__(self, str):&lt;br/&gt;                """&lt;br/&gt;                Generate a new object from a crontab line. We should differentiate between the following types of crontabs:&lt;br/&gt;                1. something = something (raise exception)&lt;br/&gt;                2.   (classic cron schedule)&lt;br/&gt;                3. [!&amp;amp;]word(arg)[,word(arg)...]   (fcron style schedule)&lt;br/&gt;                4. #somestuff (comment, raise exception)&lt;br/&gt;                5.  
(empty line, raise exception)&lt;br/&gt;                """&lt;br/&gt;&lt;br/&gt;                if re.compile("^\s*$").search(str):&lt;br/&gt;                        raise NotACronJobError("EMPTY")&lt;br/&gt;                elif re.compile("^\s*#").search(str):&lt;br/&gt;                        m = re.compile("^\s*#(.*)").search(str)&lt;br/&gt;                        raise NotACronJobError("COMMENT", m.group(1))&lt;br/&gt;                elif re.compile("^\s*\S+\s*=.+").search(str):&lt;br/&gt;                        m = re.compile("^\s*(\S+?)\s*=\s*(.+)").search(str)&lt;br/&gt;                        raise NotACronJobError("VARIABLE", m.group(1), m.group(2))&lt;br/&gt;                elif re.compile("^(\*|\d+)").search(str) or re.compile("^[!&amp;amp;]\w+").search(str):&lt;br/&gt;                        if re.compile("^!.+?\)\s*$").search(str): raise NotACronJobError("GARBAGE", str)&lt;br/&gt;                        self._parseLine(str)&lt;br/&gt;                        return&lt;br/&gt;                else:&lt;br/&gt;                        raise(NotACronJobError("GARBAGE", str))&lt;br/&gt;&lt;br/&gt;        def _parseLine(self, str):&lt;br/&gt;                if re.compile("^[!&amp;amp;]\w+").search(str):&lt;br/&gt;                        self.type = "fcron"&lt;br/&gt;                        m = re.compile("^\S+\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(.+)").search(str)&lt;br/&gt;                else:&lt;br/&gt;                        self.type = "vixie"&lt;br/&gt;                        m = re.compile("^(\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(.+)").search(str)&lt;br/&gt;                self.min = self._parseDateTime(m.group(1), "min")&lt;br/&gt;                self.hr = self._parseDateTime(m.group(2), "hr")&lt;br/&gt;                self.dom = self._parseDateTime(m.group(3), "dom")&lt;br/&gt;                self.mon = self._parseDateTime(m.group(4), "mon")&lt;br/&gt;                self.dow = self._parseDateTime(m.group(5), "dow")&lt;br/&gt;                
self.cmd = self._parseCmd(m.group(6))&lt;br/&gt;&lt;br/&gt;        def _parseDateTime(self, dt, type):&lt;br/&gt;                # Valid values for each field (range() excludes its upper bound)&lt;br/&gt;                min = range(0,60)&lt;br/&gt;                hr = range(0,24)&lt;br/&gt;                dom = range(1,32)&lt;br/&gt;                mon = range(1,13)&lt;br/&gt;                dow = range(0,7)&lt;br/&gt;                if dt == "*":&lt;br/&gt;                        return None&lt;br/&gt;                elif re.compile("^\d+$").search(dt):&lt;br/&gt;                        return range(int(dt),int(dt) + 1)&lt;br/&gt;                elif re.compile(",").search(dt):&lt;br/&gt;                        dts = dt.split(",")&lt;br/&gt;                        parsed = [self._parseDateTime(x, type) for x in dts]&lt;br/&gt;                        res = []&lt;br/&gt;                        for x in parsed:&lt;br/&gt;                                # "*" parses to None, so skip it; collect everything else&lt;br/&gt;                                if x is not None: res.extend(x)&lt;br/&gt;                        return res&lt;br/&gt;                elif re.compile("\/").search(dt):&lt;br/&gt;                        m = re.compile("(.+?)/(.+)").search(dt)&lt;br/&gt;                        r = m.group(1)&lt;br/&gt;                        st = m.group(2)&lt;br/&gt;                        if r == "*":&lt;br/&gt;                                r = eval(type)&lt;br/&gt;                        else:&lt;br/&gt;                                (x,y) = r.split("-")&lt;br/&gt;                                r = range(int(x),int(y) + 1)&lt;br/&gt;                        # r[-1] is the last valid value, so add 1 to keep it in the stepped range&lt;br/&gt;                        return range(r[0], r[-1] + 1, int(st))&lt;br/&gt;                elif re.compile("-").search(dt):&lt;br/&gt;                        m = re.compile("(\d+)-(\d+)").search(dt)&lt;br/&gt;                        return range(int(m.group(1)),int(m.group(2)) + 1)&lt;br/&gt;                else:&lt;br/&gt;                        raise NotACronJobError("GARBAGE", dt)&lt;br/&gt;&lt;br/&gt;        def _parseCmd(self, cmd):&lt;br/&gt;                if 
re.compile("^\s*root\s*").search(cmd):&lt;br/&gt;                        cmd = re.compile("^\s*root\s*").sub("", cmd)&lt;br/&gt;                return cmd&lt;br/&gt;&lt;br/&gt;        def __str__(self):&lt;br/&gt;                s = "Run %s" % self.cmd&lt;br/&gt;                if self.mon != None:&lt;br/&gt;                        months = ("Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")&lt;br/&gt;                        # cron months are 1-12, tuple indexes are 0-11&lt;br/&gt;                        s = s + " in " + ",".join([months[x - 1] for x in self.mon])&lt;br/&gt;                if self.dom != None:&lt;br/&gt;                        # days that do not take the plain "th" suffix&lt;br/&gt;                        suffixes = {1: "st", 2: "nd", 3: "rd", 21: "st", 22: "nd", 23: "rd", 31: "st"}&lt;br/&gt;                        tmp = ",".join(["%s%s" % (x, suffixes.get(x, "th")) for x in self.dom])&lt;br/&gt;                        s = s + " on " + tmp + " day"&lt;br/&gt;                        if self.mon == None:&lt;br/&gt;                                s = s + " of every month"&lt;br/&gt;                if self.dow != None:&lt;br/&gt;                        week = ("sunday", "monday", "tuesday", "wednesday", "thursday", "friday", "saturday")&lt;br/&gt;                        # cron accepts both 0 and 7 for sunday&lt;br/&gt;                        s = s + " on " + ",".join([week[x % 7] for x in self.dow])&lt;br/&gt;                if self.hr != None:&lt;br/&gt;                        if len(self.hr) == 1 and self.min != None and len(self.min) == 1:&lt;br/&gt;                                s = s + " at %02d:%02d" % (self.hr[0], self.min[0])&lt;br/&gt;                        else:&lt;br/&gt;                                s = s + " at " + ",".join([str(x) for x in self.hr])&lt;br/&gt;                        if self.dow == None and self.dom == None:&lt;br/&gt;                                s = s + " every day"&lt;br/&gt;                elif self.min != None:&lt;br/&gt;                        s = s + " at %s minutes" % ",".join([str(x) for x in self.min]) + " of every hour "&lt;br/&gt;                else:&lt;br/&gt;                        s = s + " every minute"&lt;br/&gt;                return 
s&lt;br/&gt;&lt;br/&gt;&lt;br/&gt;class NotACronJobError(Exception):&lt;br/&gt;        """An exception raised by CronJob to indicate that the line in question doesn't contain valid cron schedule information."""&lt;br/&gt;        def __str__(self):&lt;br/&gt;                if self.args[0] == "EMPTY":&lt;br/&gt;                        return "Empty Line"&lt;br/&gt;                elif self.args[0] == "COMMENT":&lt;br/&gt;                        return "A comment: %s" % self.args[1]&lt;br/&gt;                elif self.args[0] == "VARIABLE":&lt;br/&gt;                        return "An environment variable: %s = %s" % (self.args[1], self.args[2])&lt;br/&gt;                elif self.args[0] == "GARBAGE":&lt;br/&gt;                        return "Uncronish thingamabob: %s" % self.args[1]&lt;br/&gt;                else:&lt;br/&gt;                        return "If you don't know how to play with me, go to the other sandbox!"&lt;br/&gt;&lt;br/&gt;if __name__ == "__main__":&lt;br/&gt;        for line in sys.stdin:&lt;br/&gt;                try:&lt;br/&gt;                        print(CronJob(line))&lt;br/&gt;                except NotACronJobError as err:&lt;br/&gt;                        print(err)&lt;br/&gt;&lt;/pre&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6138855974683141265-553548906282113625?l=blog.mitechki.net' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://blog.mitechki.net/feeds/553548906282113625/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://blog.mitechki.net/2008/07/crontab-to-english-translator.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6138855974683141265/posts/default/553548906282113625'/><link rel='self' type='application/atom+xml' 
href='http://www.blogger.com/feeds/6138855974683141265/posts/default/553548906282113625'/><link rel='alternate' type='text/html' href='http://blog.mitechki.net/2008/07/crontab-to-english-translator.html' title='crontab to english translator'/><author><name>Heavy Battle Wombat</name><uri>http://www.blogger.com/profile/06845772522703289167</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='00436648329331425664'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6138855974683141265.post-8454490390419699363</id><published>2010-04-29T21:22:00.001-06:00</published><updated>2010-05-12T02:25:10.946-06:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='misc'/><title type='text'>Resuming posting</title><content type='html'>This blog has been on a hiatus for a while, mostly because I was busy or lazy or both. Now I will try and resume occasional posting. I think, I will start with some posts on switching from VIM to Emacs (as if that has never been blogged before) and setting up and using Clojure (same for this). 
And then I will see where that takes me.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6138855974683141265-8454490390419699363?l=blog.mitechki.net' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://blog.mitechki.net/feeds/8454490390419699363/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://blog.mitechki.net/2010/04/resuming-posting.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6138855974683141265/posts/default/8454490390419699363'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6138855974683141265/posts/default/8454490390419699363'/><link rel='alternate' type='text/html' href='http://blog.mitechki.net/2010/04/resuming-posting.html' title='Resuming posting'/><author><name>Heavy Battle Wombat</name><uri>http://www.blogger.com/profile/06845772522703289167</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='00436648329331425664'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6138855974683141265.post-3923750973995179552</id><published>2008-05-16T12:41:00.002-06:00</published><updated>2010-05-12T02:24:46.143-06:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='mysql'/><title type='text'>Restoring MySQL databases CLI trick</title><content type='html'>It is very easy to dump and restore a database using the mysql and mysqldump CLI utilities, just&lt;br/&gt;&lt;code&gt;&lt;br/&gt;# backup&lt;br/&gt;mysqldump --single-transaction mydb &amp;gt; dump.sql&lt;br/&gt;# restore&lt;br/&gt;mysql mydb &amp;lt; dump.sql&lt;br/&gt;&lt;/code&gt;&lt;br/&gt;and you are all set. 
Unfortunately, if your database is several gigabytes in size and takes a long time to restore, you might want some sort of output to indicate how far along your backup or restore is. For backup, you just add the -v flag to your mysqldump command and it will print some information about which table it is backing up. What about restore? While it is definitely possible to just go and check which table is being restored (mysqldump dumps tables in alphabetical order), I came up with a clever little trick to make the restore progress obvious and similar to mysqldump. Just add perl.&lt;br/&gt;&lt;code&gt;&lt;br/&gt;cat dump.sql | perl -ne '/Table structure for table \`(.*?)\`/ &amp;amp;&amp;amp; do {chomp($t=`date`); print STDERR $t . " loading $1\n";}; print' | mysql mydb&lt;/code&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6138855974683141265-3923750973995179552?l=blog.mitechki.net' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://blog.mitechki.net/feeds/3923750973995179552/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://blog.mitechki.net/2008/05/restoring-mysql-databases-cli-trick.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6138855974683141265/posts/default/3923750973995179552'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6138855974683141265/posts/default/3923750973995179552'/><link rel='alternate' type='text/html' href='http://blog.mitechki.net/2008/05/restoring-mysql-databases-cli-trick.html' title='Restoring MySQL databases CLI trick'/><author><name>Heavy Battle Wombat</name><uri>http://www.blogger.com/profile/06845772522703289167</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' 
value='00436648329331425664'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6138855974683141265.post-3178905368889716638</id><published>2008-03-14T16:53:00.001-06:00</published><updated>2010-05-12T02:24:33.252-06:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='rant'/><category scheme='http://www.blogger.com/atom/ns#' term='unix'/><title type='text'>Why I don't like Debian based distributions.</title><content type='html'>I have been happily using Fedora for a while now, but I keep a close eye on Ubuntu development, since in my humble opinion nothing, at the moment, compares to Ubuntu in ease of use, hardware compatibility and general togetherness. I recommend Ubuntu to people who want to try out Linux, I ran Ubuntu myself for a while, I run beta versions of Ubuntu releases and file bugs (well, when I have time). Now I also have an Eee PC laptop running Ubuntu. I like Ubuntu. But I run Fedora as my main OS. The reason for this is that Ubuntu is a Debian derivative and as such drags with it all the horrible Debian legacy. I honestly wish Ubuntu had chosen a different base. What is horrible about Debian? Well, this post intends to list a few things that annoy the hell out of me and that IMHO should have been fixed ages ago. Yes, I am aware that I blaspheme.&lt;br/&gt;&lt;ol&gt;&lt;li&gt;&lt;strong&gt;Package installation procedure&lt;/strong&gt; - when a list of packages is being installed or upgraded, the Debian package manager, dpkg, does this procedure in stages. That is, it will first unpack all the packages, then run all the pre-install scripts, then install the files, then run the post-install scripts etc. (I am not trying to be correct about the exact steps here). And while this behavior might seem logical, therein lies a problem. If, for example, one of the packages' post-install scripts fails, dpkg reports a problem and quits, and all the rest of the packages remain unconfigured. 
True, dpkg will continue where it left off when the user resolves whatever problem is causing the script to fail or removes the offending package, but this is not the point. Let's consider a case where the updated package itself is broken and its script fails because of a syntax error. Once dpkg fails, the system ends up in a rather strange state. All the services that were to be updated were stopped, but weren't started again (since that happens in the post-install). New libraries were unpacked, but ldconfig wasn't run. A new kernel might have been installed, but the new initrd wasn't generated and the boot manager wasn't updated. Basically we have a broken system that needs careful fixing by a specialist who knows what he is doing. And even if you do know what you are doing, your choices are limited. You need to either fix the script yourself, repackage and reinstall, but that makes your system somewhat inconsistent, or you need to completely remove the package, rerun dpkg to finish the install/upgrade of the other packages in the queue and try to reinstall the old version, but that might not be possible since all the other packages might prevent the old version from being installed, so you need to descend into dependency hell and start selectively uninstalling and downgrading packages to get a working version of whatever software. Yes, some of it is also true about RPMs, but at least when one of the RPM installs fails, all the rest of the packages are either NOT installed or installed COMPLETELY; nothing except possibly the broken package is done halfway.&lt;/li&gt;&lt;li&gt;&lt;strong&gt;Package state markings&lt;/strong&gt; &lt;strong&gt;- &lt;/strong&gt;as I mentioned in the previous paragraph, when dpkg fails to do some of its tasks it can be rerun and will proceed from the point where it stopped (or fail in the same place). This is done by keeping very granular records of package state. APT seems to like to mark packages just a little bit too much and annoys its user. 
Let's say I have started an install of a package that needs 100MB of dependencies and suddenly I need to go somewhere. So, I hit CTRL-C, close the laptop and run out. Later I find that my laptop doesn't have a reader of some sort installed, for example FBReader, and I need it right away to read some document. I hit apt-get install fbreader, but suddenly the whole 100MB of stuff starts downloading again. Why? Because APT marked all those packages for installation and will install them unless they are unmarked. Honestly I don't know how to easily unmark packages marked for install/upgrade short of doing dpkg --get-selections, manually editing the resulting list and piping it into dpkg --set-selections. There may be a way to do this using a GUI such as synaptic, but at a glance I couldn't find it. Another example of this "feature" is when you are trying to remove packages. Sometimes you see a package and think "I don't need this, why is it installed", so you dpkg -P it. And suddenly dpkg tells you that the package is actually a dependency of something or other. But although dpkg proudly reported that it did not remove the package in question because of dependency problems, it DID however mark the package as "to be removed", so if the dependencies ever change this package might just disappear without any intervention.&lt;br/&gt;&lt;/li&gt;&lt;li&gt;&lt;strong&gt;SysV scripts&lt;/strong&gt; - Debian, like most other Linux distros, uses SysV startup. One feature though seems to be specifically designed to annoy the hell out of the user. Every time a service that has a startup script is upgraded, it is automatically set up to start at boot. Even if it was manually turned off before. In Fedora I can say chkconfig httpd off and Apache will not start until I say otherwise. On a less sophisticated system I can say something like rm -f /etc/rc?.d/S*httpd to achieve the same result. 
On Debian I can update-rc.d -f apache remove, but once an upgrade to the apache package is installed it will reinstate itself on its default runlevels and happily start on boot. As far as I know, there is NO way to prevent this. Ridiculous.&lt;/li&gt;&lt;li&gt;&lt;strong&gt;Package management command set&lt;/strong&gt; - this is not so much a problem as a source of a lot of confusion, and it is not restricted to the package management system. There is just too much legacy in the Debian native commands. The package system provides a very good illustration. In Fedora I generally use two package management commands, yum and rpm. Yum mostly works with remote repositories and handles package installs, upgrades etc. RPM works with locally installed packages and manages installing from a local file, querying the package DB, removing packages, package signing keys etc. In Debian it is not as simple. To install from remote repositories I use apt-get. To search remote repositories I use apt-cache. To install from a local file or remove a package I use dpkg. To query the package database I use dpkg-query. To manage keys I use apt-key. Each of these has its own specific subcommands and flags.&lt;br/&gt;&lt;/li&gt;&lt;li&gt;&lt;strong&gt;DEFOMA&lt;/strong&gt; - the Debian Font Manager. Basically this is a convoluted something that is supposed to make all the font management automagical. 
Unfortunately all it seems to do is confuse anyone who tries to figure out what happens to fonts on the system.&lt;/li&gt;&lt;/ol&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6138855974683141265-3178905368889716638?l=blog.mitechki.net' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://blog.mitechki.net/feeds/3178905368889716638/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://blog.mitechki.net/2008/03/why-i-don-like-debian-based.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6138855974683141265/posts/default/3178905368889716638'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6138855974683141265/posts/default/3178905368889716638'/><link rel='alternate' type='text/html' href='http://blog.mitechki.net/2008/03/why-i-don-like-debian-based.html' title='Why I don&amp;#39;t like Debian based distributions.'/><author><name>Heavy Battle Wombat</name><uri>http://www.blogger.com/profile/06845772522703289167</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='00436648329331425664'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6138855974683141265.post-2328822180762625181</id><published>2007-11-27T11:31:00.001-07:00</published><updated>2010-05-12T02:24:19.757-06:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='misc'/><title type='text'>Yet another post about firefox extensions</title><content type='html'>Previously I have written about &lt;a href="http://mitechki.net/blog/2007/03/30/firefox-extensions-to-install-first/"&gt;various useful extensions&lt;/a&gt; for &lt;a href="http://www.mozilla.com/en-US/firefox/"&gt;Firefox&lt;/a&gt;. 
Recently I have tested quite a few extensions that didn't make my "install these first" list, and although I do not think any of these are of the "you are not browsing right if you do not have this" grade, I find some of them to be rather nice additions to my web experience.&lt;br/&gt;&lt;ul&gt;&lt;li&gt;&lt;a href="http://www.customizegoogle.com/"&gt;CustomizeGoogle&lt;/a&gt; - is one of the subtle, yet extremely powerful extensions. Once you install it, suddenly your experience with Google search, GMail, Google Calendar and other Google products just becomes nicer. You get Google Suggest keywords while you type, GMail auto-redirects to an encrypted version, you get links to other search engines in your search results, Google Images starts to actually point to images etc.&lt;/li&gt;&lt;li&gt;&lt;a href="https://addons.mozilla.org/en-US/firefox/addon/4810"&gt;SpeedDial&lt;/a&gt; - if you ever used Opera, you already know what this is about. Basically, it adds a special location (you can configure it to be your home page) that shows thumbnails of several (nine by default) sites of your choice with handy shortcuts to go straight to these sites. I used to keep a lot of tabs open at all times in my Firefox sessions, in order to have all the reference documentation I need at hand at all times. Now I just assign relevant pages to my Speed Dial and voila, CTRL-1 gives me a tab with the Apache 2.2 manuals, CTRL-2 a tab with the MySQL reference manual etc. Or I can just open a new tab and click on whatever I need right there. Again, I can see people saying that this is just an unneeded addition to bookmarks and the bookmark toolbar. The bookmark toolbar takes screen space. Bookmarks are nice, but since you cannot assign a shortcut to a particular bookmark (as far as I know), Speed Dial actually does speed up getting to your favorite sites even if just a little bit. As an alternative, one can always use bookmark keywords (one of the more obscure Firefox features). 
For example, you can bookmark Slashdot.org and assign the keyword slash to it. Then you use CTRL-T to open a new tab, CTRL-L to switch focus to the location bar, type slash and hit enter. This is much faster than browsing the bookmarks menu with a mouse (especially for keyboard oriented people like me), but not as fast or visually friendly as using the Speed Dial extension. After all, with Speed Dial you do not need to remember keywords. &lt;strong&gt;Note: &lt;/strong&gt;Obviously some of these arguments are useless for people who use the mouse more than the keyboard. But I would guess that with one of the mouse gesture extensions you should be able to map Speed Dials to gestures.&lt;br/&gt;&lt;/li&gt;&lt;li&gt;&lt;a href="https://addons.mozilla.org/en-US/firefox/addon/4429"&gt;Secure Login&lt;/a&gt; - this one is even more subtle. If you are using Password Manager to remember your login information, you might sometimes be annoyed that it fills out your login info whether you actually want it or not. Secure Login will change this behavior to a more appropriate one. Every time there is a login form on the page, Secure Login will search the Password Manager for a fitting login/password combo and if it finds one it will highlight the form fields in yellow, light up an icon in the status bar and may, if configured, even play a sound. It will prevent the P.M. from filling the info into the form. Pressing a shortcut key or clicking a toolbar button will fill the form and submit it in one motion (or just fill the form if you are so inclined). It can warn you if the form is attempting to submit something to a domain different from the one the form's page is located on and will show a popup to indicate where the form will be sent.&lt;br/&gt;&lt;/li&gt;&lt;li&gt;&lt;a href="https://addons.mozilla.org/en-US/firefox/addon/3694"&gt;Resizable Form Fields&lt;/a&gt; - does exactly what its name suggests. It allows you to resize text fields, text areas, combo-boxes and lists. Well... 
most of the time at least. I have seen a few sites where it doesn't work (probably due to absolute positioning or some other CSS tricks). But where it works it is a nice feature to have.&lt;/li&gt;&lt;li&gt;&lt;a href="https://addons.mozilla.org/en-US/firefox/addon/1813"&gt;TrashMail.net&lt;/a&gt; - will add a menu item "Paste a disposable email address" next to Paste. When used it will use trashmail.net site to generate a temporary email address. This is very useful when trying to read an article from some suspicious site that requires registration.&lt;/li&gt;&lt;li&gt;&lt;a href="http://roachfiend.com/archives/2005/02/07/bugmenot/"&gt;BugMeNot&lt;/a&gt; - will use the bugmenot.com login database to login into those annoying sites that require you to register in order to read. New York Times is one of the popular examples. Yes, this is a morally questionable practice, but those compulsive registration dudes are just soooo annoying and I am not a lawyer to be able to properly read their "Privacy Policy" documents :)&lt;/li&gt;&lt;li&gt;&lt;a href="https://addons.mozilla.org/en-US/firefox/addon/2871"&gt;URL Fixer&lt;/a&gt; - this is probably the subtlest one. It will quietly fix basic typos in URLs. Ever typed www.google.con or wwww.gmail.com? No more.&lt;/li&gt;&lt;li&gt;&lt;a href="https://addons.mozilla.org/en-US/firefox/addon/427"&gt;ScrapBook&lt;/a&gt; - this is one of the more non-obvious and extremely powerful add-ons. ScrapBook will allow you to properly gather and organize the data you mine on the web and will give you some tools to properly work with the materials. On a more particular note, ScrapBook will allow you to save a page or a fragment of a page completely to your hard drive. It will allow you to organize these fragments and pages into folders (same as you would organize bookmarks). It will allow you to mark up (same as with a highlighter pen) parts of the pages you saved and add notes and annotations. 
Since ScrapBook will actually save the data locally you will not worry about the data going off line or changing at the original location. This is a beautiful tool to do research on the web.&lt;/li&gt;&lt;/ul&gt;These are the extensions for a common user of Firefox that I have recently added to my add-on arsenal. Stay tuned for my post about some other extensions which are more useful to developers, hackers and power users.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6138855974683141265-2328822180762625181?l=blog.mitechki.net' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://blog.mitechki.net/feeds/2328822180762625181/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://blog.mitechki.net/2007/11/yet-another-post-about-firefox.html#comment-form' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6138855974683141265/posts/default/2328822180762625181'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6138855974683141265/posts/default/2328822180762625181'/><link rel='alternate' type='text/html' href='http://blog.mitechki.net/2007/11/yet-another-post-about-firefox.html' title='Yet another post about firefox extensions'/><author><name>Heavy Battle Wombat</name><uri>http://www.blogger.com/profile/06845772522703289167</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='00436648329331425664'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6138855974683141265.post-3559968622388333644</id><published>2007-10-31T10:32:00.001-06:00</published><updated>2010-05-12T02:24:08.045-06:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='misc'/><title type='text'>VirtualBox - the VMware alternative</title><content 
type='html'>Yesterday I discovered &lt;a href="http://virtualbox.org"&gt;VirtualBox&lt;/a&gt;. In short, VirtualBox is yet another virtualization package. It provides more or less the same function as VMware, Xen, Qemu and VirtualPC. At the moment it is happily running a FreeBSD world build as a guest on my Fedora 8 workstation. I cannot say that my testing of this product is complete, but as far as first impressions go, they are fairly favorable. Out of the features Let's split these impressions into the three usual categories.&lt;br/&gt;&lt;br/&gt;&lt;strong&gt;The Good:&lt;/strong&gt;&lt;br/&gt;&lt;ul&gt;&lt;li&gt;Supports the virtualization extensions of modern CPUs&lt;br/&gt;&lt;/li&gt;&lt;li&gt;Seems less I/O intensive than VMware&lt;/li&gt;&lt;li&gt;Works on FreeBSD&lt;br/&gt;&lt;/li&gt;&lt;/ul&gt;&lt;strong&gt;The Bad:&lt;/strong&gt;&lt;br/&gt;&lt;ul&gt;&lt;li&gt;The GUI is somewhat clunky&lt;/li&gt;&lt;li&gt;No script to automatically configure the kernel module and network&lt;/li&gt;&lt;/ul&gt;&lt;b&gt;The Ugly:&lt;br/&gt;&lt;/b&gt;&lt;ul&gt;&lt;li&gt;In order to activate the kernel module, I had to guess the location of the module source and run make &amp;amp;&amp;amp; make install from the CLI.&lt;/li&gt;&lt;li&gt;In order to activate bridged networking I had to manually configure ethernet bridging&lt;/li&gt;&lt;li&gt;The VM crashed once for no apparent reason&lt;/li&gt;&lt;li&gt;Sometimes the FreeBSD guest seems to have some problems with the virtual CPU.&lt;br/&gt;&lt;/li&gt;&lt;/ul&gt;Overall the experience was not all bad. There are some things which I think could be smoother, but it works. 
Good luck to the developers.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6138855974683141265-3559968622388333644?l=blog.mitechki.net' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://blog.mitechki.net/feeds/3559968622388333644/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://blog.mitechki.net/2007/10/virtualbox-vmware-alternative.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6138855974683141265/posts/default/3559968622388333644'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6138855974683141265/posts/default/3559968622388333644'/><link rel='alternate' type='text/html' href='http://blog.mitechki.net/2007/10/virtualbox-vmware-alternative.html' title='VirtualBox - the VMware alternative'/><author><name>Heavy Battle Wombat</name><uri>http://www.blogger.com/profile/06845772522703289167</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='00436648329331425664'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6138855974683141265.post-4199954953665836759</id><published>2007-07-31T10:44:00.001-06:00</published><updated>2010-05-12T02:23:54.370-06:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='mysql'/><title type='text'>MySQL features I would kill for.</title><content type='html'>It seems that nowadays there is a trend in writing "top 10 features I want software X to have". I have seen at least two such posts about&lt;br/&gt;MySQL, &lt;a href="http://mysqldatabaseadministration.blogspot.com/2007/06/my-top-x-wishlist-for-mysql.html"&gt;here&lt;/a&gt; and &lt;a href="http://www.mysqlperformanceblog.com/2007/06/29/top-5-wishes-for-mysql/"&gt;here&lt;/a&gt;. 
So, since I have been working with MySQL for a while, here is my list:&lt;br/&gt;&lt;ol&gt;&lt;br/&gt;&lt;li&gt;File per table backup mode for mysqldump that would work with the --single-transaction flag&lt;/li&gt;&lt;br/&gt;&lt;li&gt;Clustering without the NDB in-memory storage&lt;/li&gt;&lt;br/&gt;&lt;li&gt;Ability to turn logs (query, binary, slow queries) on and off without restarting&lt;/li&gt;&lt;br/&gt;&lt;li&gt;Ability to set up log filters (such as logging queries that use a particular table into a separate file or logging queries that scan more than 10K rows)&lt;/li&gt;&lt;br/&gt;&lt;li&gt;Ability to use bound variables in prepared statements properly (such as using variables in LIMIT or passing table names in the variables)&lt;/li&gt;&lt;br/&gt;&lt;li&gt;Proper implementation of views (proper, as in not involving running a select every time a view is queried)&lt;/li&gt;&lt;/ol&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6138855974683141265-4199954953665836759?l=blog.mitechki.net' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://blog.mitechki.net/feeds/4199954953665836759/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://blog.mitechki.net/2007/07/mysql-features-i-would-kill-for.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6138855974683141265/posts/default/4199954953665836759'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6138855974683141265/posts/default/4199954953665836759'/><link rel='alternate' type='text/html' href='http://blog.mitechki.net/2007/07/mysql-features-i-would-kill-for.html' title='MySQL features I would kill for.'/><author><name>Heavy Battle Wombat</name><uri>http://www.blogger.com/profile/06845772522703289167</uri><email>noreply@blogger.com</email><gd:extendedProperty 
xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='00436648329331425664'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6138855974683141265.post-8911212008115102400</id><published>2007-06-06T12:51:00.001-06:00</published><updated>2010-05-12T02:23:31.053-06:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='unix'/><title type='text'>Fedora 7 and ATI binary drivers. An Ugly Hack.</title><content type='html'>There is a known problem with the recently released Fedora 7 and ATI video cards.&lt;br/&gt;&lt;ul&gt;&lt;li&gt;Most recent driver (version 8.37.6) causes X server to segfault&lt;/li&gt;&lt;li&gt;Older drivers do not support new Xorg versioning system (server reports 1.3 and driver expects &amp;gt;7)&lt;br/&gt;&lt;/li&gt;&lt;li&gt;Xorg open source ATI drivers do not have support for anything past Radeon 9250 (due to ATI not disclosing specs)&lt;/li&gt;&lt;li&gt;Xorg VESA driver doesn't support either 3D acceleration or multi screen and is generally rather slow&lt;/li&gt;&lt;/ul&gt;All this caused Michael Larabel (who seems to know most about the state of ATI drivers for Linux) to warn people &lt;a href="http://www.michaellarabel.com/index.php?k=blog&amp;amp;i=224"&gt;not to upgrade to Fedora 7&lt;/a&gt; just yet.&lt;br/&gt;So, what do you do, if you already upgraded (like me)? Well, if you have single monitor and don't play games much, you can probably live with VESA driver.&lt;br/&gt;Otherwise you can temporarily downgrade your X server to the supported version. 
Here is a short HOWTO:&lt;br/&gt;&lt;ol&gt;&lt;br/&gt;&lt;li&gt;Log in as root&lt;br/&gt;&lt;code&gt;su -&lt;/code&gt;&lt;/li&gt;&lt;br/&gt;&lt;li&gt;Add the freshrpms repository&lt;br/&gt;&lt;code&gt;rpm -ivh http://ftp.freshrpms.net/pub/freshrpms/fedora/linux/7/freshrpms-release/freshrpms-release-1.1-1.fc.noarch.rpm&lt;/code&gt; &lt;/li&gt; &lt;li&gt;Install the ATI proprietary drivers&lt;br/&gt;&lt;code&gt;yum install ati-x11-drv&lt;/code&gt; &lt;/li&gt;&lt;br/&gt;&lt;li&gt;Start the ATI event daemon&lt;br/&gt;&lt;code&gt;service atieventsd restart&lt;/code&gt;&lt;/li&gt;&lt;br/&gt;&lt;li&gt;Download and install an old version of the Xorg server&lt;br/&gt;&lt;code&gt;wget http://ftp.cica.es/fedora/linux/core/test/6.91/Prime/x86_64/os/Fedora/xorg-x11-server-Xorg-1.2.0-6.fc7.x86_64.rpm&lt;br/&gt;rpm -U --force xorg-x11-server-Xorg-1.2.0-6.fc7.x86_64.rpm &lt;/code&gt; &lt;/li&gt;  &lt;li&gt;Uninstall the newer Xorg server&lt;br/&gt;&lt;code&gt;rpm -e xorg-x11-server-Xorg-1.3.0.0-5.fc7&lt;/code&gt; &lt;/li&gt; &lt;li&gt;Prevent YUM from upgrading Xorg again (the -i flag makes sed edit the file in place instead of just printing the result)&lt;br/&gt;&lt;code&gt;sed -i '/metadata/aexclude=xorg-x11-server-Xorg*' /etc/yum.conf&lt;/code&gt; &lt;/li&gt; &lt;li&gt;Configure Xorg to use the ATI drivers using aticonfig&lt;br/&gt;&lt;ol&gt; &lt;li&gt;CTRL-ALT-F1 to switch to the console and log in as root&lt;/li&gt; &lt;li&gt;&lt;code&gt;telinit 3&lt;/code&gt; &lt;/li&gt;&lt;li&gt;&lt;code&gt;aticonfig --initial&lt;/code&gt; for a single monitor or &lt;code&gt;aticonfig --initial=dual-head&lt;/code&gt; for dual monitors&lt;/li&gt; &lt;li&gt;&lt;code&gt;telinit 5&lt;/code&gt;&lt;/li&gt; &lt;/ol&gt; &lt;/li&gt;&lt;/ol&gt; This is it. 
At this point you should have a proper, 3D-accelerated setup.&lt;br/&gt;I took and adapted most of these directions from &lt;a href="http://forums.fedoraforum.org/showthread.php?t=155503&amp;amp;page=1&amp;amp;pp=15"&gt;this thread&lt;/a&gt; at &lt;a href="http://www.fedoraforum.org/"&gt;fedoraforum.org&lt;/a&gt;.&lt;br/&gt;&lt;br/&gt;&lt;b&gt;Update:&lt;/b&gt; There is a new release of the ATI drivers that works with Xorg 7.3 (somewhat). It is packaged by both freshrpms and livna, so there is no need to downgrade the X server anymore.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6138855974683141265-8911212008115102400?l=blog.mitechki.net' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://blog.mitechki.net/feeds/8911212008115102400/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://blog.mitechki.net/2007/06/fedora-7-and-ati-binary-drivers-ugly.html#comment-form' title='8 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6138855974683141265/posts/default/8911212008115102400'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6138855974683141265/posts/default/8911212008115102400'/><link rel='alternate' type='text/html' href='http://blog.mitechki.net/2007/06/fedora-7-and-ati-binary-drivers-ugly.html' title='Fedora 7 and ATI binary drivers. 
An Ugly Hack.'/><author><name>Heavy Battle Wombat</name><uri>http://www.blogger.com/profile/06845772522703289167</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='00436648329331425664'/></author><thr:total>8</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6138855974683141265.post-4628762188361325721</id><published>2007-05-11T12:27:00.001-06:00</published><updated>2010-05-12T02:23:14.335-06:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='rant'/><category scheme='http://www.blogger.com/atom/ns#' term='unix'/><title type='text'>Quest for web log analysis software</title><content type='html'>I am currently searching for a web log analysis package for our site. I have to say that the more I look at the available options the more disgusted I get. Basically, what I am looking for is web log analysis software with the following features:&lt;br/&gt;&lt;ul&gt; &lt;li&gt;Reading data from web server logs (not using custom javascript to record hits)&lt;/li&gt; &lt;li&gt;Storing log data in a SQL database, so I can use SQL to generate custom reports&lt;/li&gt; &lt;li&gt;Capable of generating custom reports with custom graphs and charts&lt;/li&gt; &lt;li&gt;Capable of reading custom log formats (such as Apache LogFormat strings)&lt;/li&gt; &lt;li&gt;Able to "drill down/zoom in" into the reports for more information&lt;/li&gt; &lt;li&gt;Running on Linux, BSD or Solaris.&lt;br/&gt;&lt;/li&gt; &lt;/ul&gt; It seems that getting all of these is close to impossible. &lt;a name='more'&gt;&lt;/a&gt;Most open source packages and some commercial ones are too primitive, and the ones that do seem to hold some promise are very good at obscuring the actual functionality they provide. The "too primitive" category consists of packages that read in log files and generate a preset number of common reports, such as hits per day, hourly distribution, web browser distribution etc. 
If this is enough for you, the open source world provides adequate solutions.&lt;br/&gt;&lt;ul&gt;&lt;li&gt;&lt;a href="http://www.analog.cx/"&gt;Analog&lt;/a&gt; - very configurable. Generates a couple of dozen different reports.&lt;/li&gt;&lt;li&gt;&lt;a href="http://awstats.sourceforge.net/"&gt;Awstats&lt;/a&gt; - similar to Analog, the reports are a little nicer. Slightly less configurable, but written in Perl, so it should be easy to customize.&lt;/li&gt;&lt;li&gt;&lt;a href="http://www.hping.org/visitors/"&gt;Visitors&lt;/a&gt; - a rather primitive, but very fast log analysis program. Incapable of storing log information, so it needs the complete set of logs every time.&lt;/li&gt;&lt;/ul&gt;A little more promising is the package called &lt;a href="http://www.logreport.org/"&gt;Lire&lt;/a&gt;. It is written in Perl and distributed under the GPL. Unfortunately it is not without its quirks. To say that the documentation is lacking would not be true. There is an extensive user manual. But after digging through the manual even simple questions were left unanswered. How do I import log files manually? How do I generate a report manually? Where are the configuration files? What is the XML schema for the configuration? How do I make a custom log format converter? And so on. I know that documentation was always a weak spot in many open source projects, but in this case, the obstacles are in all the wrong places. As soon as I started to look for something in the docs it wasn't there. On the positive side the source is all there, it is written in Perl, there are a lot of included examples and with a bit of patience I was able to figure out how things work. Then the next set of disappointments came through. In order to parse custom log formats you have to write a Perl module. And although there is some documentation about the process in the developer's manual, this is a bit too much. 
I would expect the system to either take an Apache LogFormat string or to ask for a regex and field descriptions. Both log file importing and report generation are provided by the same script, lr_cron. The script is supposed to run from cron and doesn't take any parameters. As far as I understand there is no way to generate a particular report from a particular data store. And log file importing is SLOW. I have tried to import a test log file of a couple million lines and it took close to 4 hours. In a few days I will get a new server with dual dual-core Xeons and 8GB of RAM, which should be a bit faster than the box I used in my tests, but I expect to process 8-10 million lines of logs daily and it is not supposed to take all day.  &lt;br/&gt;&lt;br/&gt;At this point I have decided to turn to commercial solutions. The first few I found were silly Windows programs similar in functionality to the ones I mentioned before (Analog and friends), but without as much customization. The next batch consisted of hosted solutions that required you to insert pieces of script into your pages and gathered statistics that way. It is a nice technique, but not quite what I had in mind. We already have a solution like that. Unexpectedly, I found it very difficult to figure out if the software in question is hosted or standalone and if it uses logs or scriptlets. It seems that vendors of web analysis software go to extreme measures to hide any and all technical information related to their software. Then I started on the "big boys". And this was even more disappointing than the OSS world.&lt;br/&gt;&lt;ul&gt;&lt;li&gt;&lt;a href="https://www.webtrends.com/"&gt;WebTrends&lt;/a&gt; - one of the most famous solutions on the market. Rumored to be also one of the most expensive and powerful. Unfortunately I was not able to find out. There is no pricing information on the site. 
To get any information about product features you have to fill out a form with all your information (email, name, address, a bunch of marketing questions etc.) and it will try to sign you up for a few newsletters on the way. The resulting product sheets contain a lot of things about the marketing needs of a modern business, but nothing about the actual features of the product. Or at least no technical information. In order to obtain a trial version you have to fill out a form and supposedly a representative will contact you. Only then did I find out that the package only runs on Windows.&lt;br/&gt;&lt;/li&gt;&lt;li&gt;&lt;a href="http://www.clicktracks.com/"&gt;ClickTracks Pro&lt;/a&gt; - There is no more information on the site than in the case of WebTrends. I have yet to figure out what platforms this software runs on. I have yet to figure out if it is capable of custom reporting and importing custom logs. The pricing info is on the site though. A Pro version license costs $9,344 and there is no trial version. All they offer is to run a trial report on your data. Before I spend 10K on a software package I expect to become completely familiar with all of its features, requirements, quirks etc. I do not see how I can do that in this case.&lt;/li&gt;&lt;li&gt;&lt;a href="http://www.omniture.com/products/web_analytics/sitecatalyst"&gt;Omniture SiteCatalyst&lt;/a&gt; - same as the previous two. No information on the site. No trial. And to add insult to injury, the link to the product data sheet actually points to the "Contact Us" page. Screw you, I will contact you once I know I might want your product. I am not wasting time on your sales pitches before I find out what exactly you are selling.&lt;/li&gt;&lt;li&gt;&lt;a href="http://netinsight.unica.com/"&gt;Unica NetTracker&lt;/a&gt; - a much better experience. 
It seems the product actually uses log files, works on both Windows and UNIX, uses a database to store the data, and supports several different database products including MS-SQL and MySQL. There is a trial version, but it needs to be requested. I will be able to tell more once I actually get the trial and play with it. I am putting a lot of hope into this one.&lt;/li&gt;&lt;li&gt;&lt;a href="http://sawmill.net/"&gt;Sawmill&lt;/a&gt; - A nice software package that makes reports out of logs. Reasonably priced. Less geared towards web marketing. There is a downloadable demo and fairly complete documentation. Supports MySQL and an internal database. Runs on Windows and Linux. I have been playing with the demo version for the past couple of days. Custom log formats need to be defined manually by creating a log format definition file. The file format is relatively straightforward and the support is pretty good. The GUI configuration wizard is capable of rendering Apache LogFormat strings into a proper log format definition. The MySQL support is unfortunately lacking. The queries are unoptimized to extremes and take forever to complete even on very good hardware with a lot of server level optimizations. The internal database seems reasonable. Log file imports are rather slow and the resulting database takes a lot of space, but reporting capabilities are fantastic. The development team response is very good. I have submitted several bug reports and it is possible that by the time I have to decide on a particular package the database bugs will be fixed. &lt;br/&gt;&lt;/li&gt;&lt;/ul&gt;All in all this research has been pretty disappointing. 
If anyone has any suggestions as to other solutions I haven't tried or has anything good or bad to say about the products I mentioned you are welcome to do so in the comments.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6138855974683141265-4628762188361325721?l=blog.mitechki.net' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://blog.mitechki.net/feeds/4628762188361325721/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://blog.mitechki.net/2007/05/quest-for-web-log-analysis-software.html#comment-form' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6138855974683141265/posts/default/4628762188361325721'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6138855974683141265/posts/default/4628762188361325721'/><link rel='alternate' type='text/html' href='http://blog.mitechki.net/2007/05/quest-for-web-log-analysis-software.html' title='Quest for web log analysis software'/><author><name>Heavy Battle Wombat</name><uri>http://www.blogger.com/profile/06845772522703289167</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='00436648329331425664'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6138855974683141265.post-7811831412454498371</id><published>2007-04-24T09:50:00.001-06:00</published><updated>2010-05-12T02:22:53.421-06:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='unix'/><title type='text'>Web statistics from the command line</title><content type='html'>There are a lot of web statistics packages out there. And some of them are good. 
To name a few, there is &lt;a href="http://www.analog.cx/"&gt;Analog&lt;/a&gt; (especially when paired with &lt;a href="http://www.reportmagic.org/"&gt;Report Magic&lt;/a&gt;), &lt;a href="http://awstats.sourceforge.net/"&gt;AWStats&lt;/a&gt; and &lt;a href="http://www.hping.org/visitors/"&gt;Visitors&lt;/a&gt;. There are also &lt;a href="http://en.wikipedia.org/wiki/Web_log_analysis_software"&gt;excellent commercial packages&lt;/a&gt; (but they don't pay me to advertise :) ).&amp;nbsp; Most of these have one particular problem. They generate a number of static reports. So if you just want to see how many hits your site received per day during the last week they are excellent. Unfortunately if your question is more like "What are the top 10 pages hit by users with Internet Explorer who were referred to us by Google?" all of these programs become rather useless. &lt;a name='more'&gt;&lt;/a&gt;&lt;br/&gt; It is possible to record the web access data into a database and then do complex queries on it, but the site I work for receives millions of hits every day, so handling all those logs would require me to build another big system on top of the already existing one just to handle logs. There are probably solutions for problems like this out there. They probably cost a lot and require expensive hardware to install and run.   And our analysts keep bugging me with questions like the one I mentioned. &lt;br/&gt;So, here comes the poor man's custom web statistics. Let's start with the question I asked about the Google-referred people using IE. For the sake of simplicity I will assume the standard Apache combined log format. Mind you, some Perl knowledge is assumed.&lt;br/&gt;&lt;code&gt;&lt;br/&gt;perl -ne '/MSIE \d/&amp;amp;&amp;amp;/\d{3} (\d+|-) ".*?google\.com.*?"/&amp;amp;&amp;amp;/(POST|GET) (\/.*?) HTTP/&amp;amp;&amp;amp;$s{$2}++;END{foreach $t (keys %s){print "$s{$t}\t$t\n"}}' access.log | sort -gr | head -n 10&lt;br/&gt;&lt;/code&gt;&lt;br/&gt;Confusing? 
Let me translate a bit. The -n flag causes Perl to go through the lines of the input, running the supplied code for each line. The other flag, -e, lets you specify the code on the command line. Here is an article from &lt;a href="http://newsforge.com/"&gt;NewsForge&lt;/a&gt; with more information on &lt;a href="http://programming.newsforge.com/programming/06/03/08/1456241.shtml?tid=108&amp;amp;tid=91"&gt;using Perl from the command line&lt;/a&gt;. So let's see what we did here. I will properly format the Perl code and comment it.&lt;br/&gt;&lt;pre&gt;&lt;br/&gt;# If line has "MSIE " in it we assume the user agent is IE.&lt;br/&gt;# We are using &amp;amp;&amp;amp; everywhere to make sure that if anything fails we stop counting this line&lt;br/&gt;/MSIE \d/ &amp;amp;&amp;amp;&lt;br/&gt;# The referer field in the combined log format goes after the return code and transferred data size&lt;br/&gt;# We are looking for google.com string in the referer field&lt;br/&gt;/\d{3} (\d+|-) ".*?google\.com.*?"/ &amp;amp;&amp;amp;&lt;br/&gt;# The request starts with method and ends with the HTTP version.&lt;br/&gt;# We will catch the requested page by using regex grouping&lt;br/&gt;/(POST|GET) (\/.*?) 
HTTP/ &amp;amp;&amp;amp;&lt;br/&gt;# Now we use the captured page name as a hash key and add one to the value (count page accesses)&lt;br/&gt;$s{$2}++;&lt;br/&gt;# END block is only invoked after everything else was done&lt;br/&gt;END {&lt;br/&gt;    # for each key in the hash&lt;br/&gt;    foreach $t (keys %s) {&lt;br/&gt;        # print the hit count and the page name&lt;br/&gt;        print "$s{$t}\t$t\n";&lt;br/&gt;    }&lt;br/&gt;}&lt;br/&gt;&lt;/pre&gt;&lt;br/&gt;Then we sort the Perl output using the UNIX sort command and pipe it through head to print only the top 10 pages.&lt;br/&gt;So, as you can see, the code consists of the following parts:&lt;br/&gt;&lt;ul&gt;&lt;br/&gt;&lt;li&gt;Set of conditions that need to be satisfied to count the current line&lt;/li&gt;&lt;br/&gt;&lt;li&gt;The regular expression to store the value we need to count&lt;/li&gt;&lt;br/&gt;&lt;li&gt;The standard ending with the hash and the END block&lt;/li&gt;&lt;br/&gt;&lt;/ul&gt;&lt;br/&gt;Here are a few examples of using regular expressions to match fields in the combined log format:&lt;br/&gt;/^\d+\.\d+\.\d+\.\d+/ - matches client IP address&lt;br/&gt;/ \[(.*?)\] "(GET|POST)/ - will store the time stamp in $1&lt;br/&gt;/(GET|POST) \/.*? HTTP/ - matches the request line&lt;br/&gt;/\d{3} (\d+|-) "(.*?)"/ - will store the referrer field in $2&lt;br/&gt;/" (\d{3}) (\d+|-) "/ - will store the return HTTP code in $1&lt;br/&gt;&lt;br/&gt;By using this shortcut you can obtain a lot of interesting statistics from your web logs, without dealing with expensive software and multi-gigabyte databases. Here is another command line. What this one does I will leave as an exercise to the reader :)&lt;br/&gt;&lt;code&gt;&lt;br/&gt;perl -ne '/MSIE/||next;/^.*?" 
"(.*)"$/;$s{$1}++;END{foreach $t (keys %s){print "$s{$t}\t$t\n"}}' access.log | sort -gr | head -n 10&lt;br/&gt;&lt;/code&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6138855974683141265-7811831412454498371?l=blog.mitechki.net' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://blog.mitechki.net/feeds/7811831412454498371/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://blog.mitechki.net/2007/04/web-statistics-from-command-line.html#comment-form' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6138855974683141265/posts/default/7811831412454498371'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6138855974683141265/posts/default/7811831412454498371'/><link rel='alternate' type='text/html' href='http://blog.mitechki.net/2007/04/web-statistics-from-command-line.html' title='Web statistics from the command line'/><author><name>Heavy Battle Wombat</name><uri>http://www.blogger.com/profile/06845772522703289167</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='00436648329331425664'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6138855974683141265.post-1702334366181421082</id><published>2007-03-30T16:42:00.001-06:00</published><updated>2010-05-12T02:22:39.202-06:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='misc'/><title type='text'>Firefox extensions to install first</title><content type='html'>It happens to all of us sooner or later. My Firefox profile could not bear my continuous abuse and committed suicide without even writing a note. This event, albeit unfortunate, was not unforeseen. 
I knew that if I kept switching back and forth between Firefox 1.x and 2.x, installing and removing all sorts of suspicious extensions and tinkering with about:config settings, I would eventually be punished. So, I assessed the situation and figured that if I am careful I will not lose anything important. I backed up my corrupted profile, started and shut down Firefox to create a new one, copied my bookmarks, stored passwords and saved sessions and called it a day. Once I started Firefox again though it still didn't look friendly, so I started adding extensions. Here is my list ordered by importance.&lt;br/&gt;&lt;br/&gt;&lt;b&gt;What did I install:&lt;br/&gt;&lt;/b&gt;&lt;ul&gt;&lt;li&gt;&lt;a href="https://addons.mozilla.org/en-US/firefox/addon/1122"&gt;Tab Mix Plus&lt;/a&gt; - Is only the best tab manager extension I have seen so far. It makes tab switching behave in a logical manner (like windows on alt-tab and not in a dumb loop), and it adds a lot of useful tab-related functions such as lock tab or duplicate tab. Locking is a way to make sure that wherever you click this tab stays on the same page and links are opened in new tabs; this is highly useful for browsing lists of things, be that Google search results, bookmarks or craigslist.org listings. Also Tab Mix Plus replaces the built-in Firefox 2 crash recovery feature and turns it into a complete session manager. You can save and restore multiple sessions including closed tabs and windows (oh, did I mention that you can undo tab close with Tab Mix Plus?) and other information.&lt;/li&gt;&lt;li&gt;&lt;a href="https://addons.mozilla.org/en-US/firefox/addon/1865"&gt;Adblock Plus&lt;/a&gt; and &lt;a href="https://addons.mozilla.org/en-US/firefox/addon/1136"&gt;Adblock Filterset.G Updater&lt;/a&gt; - Unless you are a masochist and enjoy intrusive advertising you need these extensions. Yes, you really do. 
Adblock Plus effectively blocks most forms of banners, flash ads, popups (even the ones the built-in popup blocker doesn't catch) etc. The updater will download the current set of patterns, so you don't have to train the blocker yourself, and will keep it updated.&lt;br/&gt;&lt;/li&gt;&lt;li&gt;&lt;a href="http://del.icio.us/help/firefox/extension"&gt;del.icio.us firefox extension&lt;/a&gt; - A very convenient way to keep your bookmarks online. Includes a "Bookmark This" button that will open a new window allowing you to tag, describe and save the current page.&lt;br/&gt;&lt;/li&gt;&lt;li&gt;&lt;a href="https://addons.mozilla.org/en-US/firefox/addon/1811"&gt;Deepest Sender&lt;/a&gt; - There are a few blogging extensions out there that allow you to post blog entries in a comfortable (or not so comfortable in some cases) way. I have chosen Deepest Sender as my personal favorite. It supports all the major blog engines (in my case Live Journal, Blogger and WordPress), allows for simple formatting, allows direct source editing and has a simple preview. I guess I would prefer a few more WordPress specific options, but I have yet to find a better blogging solution.&lt;/li&gt;&lt;li&gt;&lt;a href="https://addons.mozilla.org/en-US/firefox/addon/1368"&gt;Colorful Tabs&lt;/a&gt; - All this extension does is paint your tabs various semi-random colors (the colors cannot be assigned, but it will make sure that no two neighboring tabs are the same color) and slightly fade the tabs which are out of focus. You cannot imagine without trying just how much easier it is to navigate multiple tabs with this extension, although your tab bar starts to look much less official.&lt;/li&gt;&lt;li&gt;&lt;a href="https://addons.mozilla.org/en-US/firefox/addon/748"&gt;GreaseMonkey&lt;/a&gt; - a generic extension allowing you to execute custom JavaScript scripts on pages you choose. Using these scripts you can enhance usability of popular sites, add missing features, change look and feel etc. 
Pre-made scripts can be downloaded from &lt;a href="http://userscripts.org/"&gt;UserScripts.Org&lt;/a&gt; site.&lt;br/&gt;&lt;/li&gt;&lt;li&gt;&lt;a href="https://addons.mozilla.org/en-US/firefox/addon/60"&gt;Web Developer&lt;/a&gt; and &lt;a href="https://addons.mozilla.org/en-US/firefox/addon/1843"&gt;FireBug&lt;/a&gt; - The first one is the web developer's multi-tool. It is a tool bar that includes all features that you could possibly want when testing the web site you are working on. Cache disabling, headers, authentication, security and other information, window resizes for different resolution simulation, element outlines etc. etc. etc. And where Web Developer leaves off, FireBug comes in. Normally hiding in the status bar icon FireBug will tell you exact lines in CSS that affect particular tag, tag that corresponds to particular element, how long it took to load and render any of the page requirements, what scripts have been loaded and much, much more.&lt;/li&gt;&lt;/ul&gt;There are several extensions I didn't install because I personally didn't find them useful, but which should still be mentioned&lt;strong&gt;&lt;/strong&gt;. &lt;br/&gt;&lt;b&gt;&lt;br/&gt;What I didn't install:&lt;/b&gt;&lt;br/&gt;&lt;ul&gt;&lt;li&gt;&lt;a href="https://addons.mozilla.org/en-US/firefox/addon/77"&gt;Sage&lt;/a&gt; - is the most popular RSS reader extension. I do not use it, because I don't like side bars and I am quite happy with my external RSS reader which happens to be &lt;a href="http://liferea.sourceforge.net/"&gt;Liferea&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href="https://addons.mozilla.org/en-US/firefox/addon/1027"&gt;All-in-One Sidebar&lt;/a&gt; - is a great tool for people who use side bar a lot. It integrates downloads, extensions, source view and other features into the side bar and allows for custom side bar panels.&lt;/li&gt;&lt;li&gt;&lt;a href="https://addons.mozilla.org/en-US/firefox/addon/1730"&gt;ScribeFire&lt;/a&gt; - is another popular blogger extension. 
It even supports some WordPress features better than Deepest Sender, but the interface is a little cumbersome and the Live Journal support is very buggy.&lt;/li&gt;&lt;/ul&gt;So, at this point my browser is ready for action again. I will be back soon with Firefox extensions for web site testing.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6138855974683141265-1702334366181421082?l=blog.mitechki.net' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://blog.mitechki.net/feeds/1702334366181421082/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://blog.mitechki.net/2007/03/firefox-extensions-to-install-first.html#comment-form' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6138855974683141265/posts/default/1702334366181421082'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6138855974683141265/posts/default/1702334366181421082'/><link rel='alternate' type='text/html' href='http://blog.mitechki.net/2007/03/firefox-extensions-to-install-first.html' title='Firefox extensions to install first'/><author><name>Heavy Battle Wombat</name><uri>http://www.blogger.com/profile/06845772522703289167</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='00436648329331425664'/></author><thr:total>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6138855974683141265.post-5672448211089667421</id><published>2007-04-19T16:02:00.001-06:00</published><updated>2010-05-12T02:22:18.877-06:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='misc'/><title type='text'>First look at Thunderbird 2</title><content type='html'>As most of you already know, &lt;a href="http://www.mozilla.com/en-US/thunderbird/"&gt;Thunderbird 2.0&lt;/a&gt; was &lt;a 
href="http://www.mozilla.com/en-US/thunderbird/2.0.0.0/releasenotes/"&gt;released today&lt;/a&gt;. I have been running the 2.0 release candidate for some time now, so I can share my opinions of the new version while the news is still hot.&lt;br/&gt;&lt;h2&gt;Good Stuff&lt;/h2&gt;&lt;br/&gt;&lt;dl&gt;&lt;br/&gt;&lt;dt&gt;&lt;b&gt;New default theme and icons&lt;/b&gt;&lt;/dt&gt;&lt;br/&gt;&lt;dd&gt;I have found both the new icon theme and the new user interface controls theme to be slightly better looking. There are no major changes here; everything just looks a little bit crisper, a little less intrusive, a little better organized and a little more aesthetically pleasing.&lt;/dd&gt;&lt;br/&gt;&lt;dt&gt;&lt;b&gt;Unlimited tags&lt;/b&gt;&lt;/dt&gt;&lt;br/&gt;&lt;dd&gt;This is not so much a new feature as a fix of an old bug. Older versions of Thunderbird used to allow you to tag messages using either manual tagging or filters. A tagged message would be shown in a particular color, so you could find out at a glance what emails you have received or what is left to do in your inbox. Unfortunately, previous versions of Thunderbird would at the same time kill this feature by providing a fixed set of five pre-made tags (you could edit the labels, but you couldn't add your own). The new version still defines the same set of five tags for backward compatibility, but will happily allow you to add any number of your own. You can easily tag your messages by hand with the first nine tags in your list by pressing number keys and you can define message filters to tag messages with particular tags.&lt;/dd&gt;&lt;br/&gt;&lt;dt&gt;&lt;b&gt;New Gecko Engine features&lt;/b&gt;&lt;/dt&gt;&lt;br/&gt;&lt;dd&gt;Since the new Thunderbird is based on the same version of Gecko (the rendering engine under Mozilla products) as Firefox 2, it inherits some features from it. 
Spell checking as you type, auto-completion etc.&lt;/dd&gt;&lt;br/&gt;&lt;dt&gt;&lt;b&gt;New mail notification&lt;/b&gt;&lt;/dt&gt;&lt;br/&gt;&lt;dd&gt;The new version is able to notify you about incoming mail by either playing a sound or flashing a small pop-up (self-destructing in a few seconds) with subjects and senders of new messages.&lt;/dd&gt;&lt;br/&gt;&lt;dt&gt;&lt;b&gt;Better support of large IMAP folders&lt;/b&gt;&lt;/dt&gt;&lt;br/&gt;&lt;dd&gt;Thunderbird 1.x used to consistently crash on me when I tried to manipulate IMAP folders with 10K+ messages. Thunderbird 2 seems not to notice the difference between 15K messages in a folder and 15 messages in a folder.&lt;/dd&gt;&lt;br/&gt;&lt;/dl&gt;&lt;br/&gt;&lt;h2&gt;Bad Stuff&lt;/h2&gt;&lt;br/&gt;&lt;dl&gt;&lt;br/&gt;&lt;dt&gt;&lt;b&gt;Finer customizations (they are there... but they are not)&lt;/b&gt;&lt;/dt&gt;&lt;br/&gt;&lt;dd&gt;I got the itch to customize the "such and such wrote" message that appears at the top of the quoted message in your replies. And to my surprise, to do this you need to edit some obscure configuration files in the Thunderbird profile directory. Yes, it is documented extensively on the &lt;a href="http://www.mozilla.org/support/thunderbird/tips"&gt;Tips and Tricks&lt;/a&gt; page, but I think this would not sit well with a casual user. The same goes for many other features that Thunderbird has, but that you will never find out about unless somebody tells you.&lt;/dd&gt;&lt;br/&gt;&lt;dt&gt;&lt;b&gt;Some icons are inconsistent with previous releases&lt;/b&gt;&lt;/dt&gt;&lt;br/&gt;&lt;dd&gt;Took me some time to get used to the new junk mail icon. 
Not a big deal though.&lt;/dd&gt;&lt;br/&gt;&lt;dt&gt;&lt;b&gt;Still no "Reply to All" shortcut of any sort&lt;/b&gt;&lt;/dt&gt;&lt;br/&gt;&lt;dd&gt;This is especially annoying when you are trying to CC on some of your business correspondence to some people (say your boss and your team) and every time you reply to a message you cannot just hit CTRL-R or some other key, but actually need to go through the menu to catch all the addresses in the original message. I suppose there has to be an extension for this somewhere, but so far I couldn't find it.&lt;br /&gt;&lt;b&gt;Update:&lt;/b&gt; Ctrl-Shift-R does reply all. I should have RTFM'd more&lt;/dd&gt;&lt;br/&gt;&lt;/dl&gt;&lt;br/&gt;&lt;h2&gt;Conclusions&lt;/h2&gt;&lt;br/&gt;&lt;ul&gt;&lt;br/&gt;&lt;li&gt;If you are already using Thunderbird, you should strongly consider upgrading. The new Thunderbird is leaner, meaner, faster and with sharper teeth :) The only reason to wait is if you are using some specific extensions not yet available for the new version&lt;/li&gt;&lt;br/&gt;&lt;li&gt;If you are not using Thunderbird and you do not require Outlook-like abilities such as calendar, to do lists, exchange compatibility etc., but only use your mail client to send and read email you should definitely consider giving Thunderbird a try.&lt;/li&gt;&lt;br/&gt;&lt;li&gt;The general feeling about the new Thunderbird is that it is not a huge leap forward, compared to previous versions, but a lot of small useful improvements making the overall experience of using it a much more pleasant one.&lt;/li&gt;&lt;/ul&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6138855974683141265-5672448211089667421?l=blog.mitechki.net' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://blog.mitechki.net/feeds/5672448211089667421/comments/default' title='Post Comments'/><link rel='replies' type='text/html' 
href='http://blog.mitechki.net/2007/04/first-look-at-thunderbird-2.html#comment-form' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6138855974683141265/posts/default/5672448211089667421'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6138855974683141265/posts/default/5672448211089667421'/><link rel='alternate' type='text/html' href='http://blog.mitechki.net/2007/04/first-look-at-thunderbird-2.html' title='First look at Thunderbird 2'/><author><name>Heavy Battle Wombat</name><uri>http://www.blogger.com/profile/06845772522703289167</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='00436648329331425664'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6138855974683141265.post-8659802139194902553</id><published>2007-03-28T10:12:00.001-06:00</published><updated>2010-05-12T02:21:58.722-06:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='rant'/><category scheme='http://www.blogger.com/atom/ns#' term='unix'/><title type='text'>A FreeBSD experiment</title><content type='html'>About a year back, there was some activity around a &lt;a href="http://news.com.com/FreeBSD+vows+to+compete+with+desktop+Linux/2100-1011_3-6071598.html?tag=nefd.top"&gt;post by one of the FreeBSD developers&lt;/a&gt; regarding FreeBSD being ready to compete with Linux (and I suppose by proxy with Windows) as a desktop system. Back then I wanted to play around with FreeBSD once again (my friendship with UNIX started with installing FreeBSD 2.2.4 on my home computer), but found some features lacking for proper support of my favorite UNIX desktop (that would be GNOME). A few days ago I figured it was a good time to take a look at what the BSD people came up with in the desktop department. 
I have done some test installs in VMware, so now I am ready to try it on my home computer. So far (after those test installs) I have figured out two main things about FreeBSD. &lt;br/&gt;&lt;ul&gt;&lt;li&gt;&lt;strong&gt;A lot of things are very different from Linux.&lt;/strong&gt;&lt;/li&gt;&lt;br/&gt;Well, this would be natural and expected, since FreeBSD is not Linux. But it has been a long time since a new system confused me and now I am refreshingly confused. The aspect I found especially confusing is disk allocation. I still hope to find reasonable documentation on what the relation between partitions, slices and labels is, how the information about the layout is stored, etc. Since all of the base system in FreeBSD is developed as part of the FreeBSD project, a lot of basic commands work in unexpected ways. This is not a problem though; I was ready for it and now I seem to cope well with the differences.&lt;br/&gt;&lt;li&gt;&lt;strong&gt;The community is extremely rude to new users.&lt;/strong&gt;&lt;/li&gt;&lt;br/&gt;This, unfortunately, is a problem I didn't expect. Over years of working with Linux, I have gotten used to people being willing to help, or at least not being outright evil. Not so in the FreeBSD world. On one of the test installs, I messed up my disks by trying to switch to a different boot manager. I couldn't boot my system and I didn't want to reinstall, since I had configured, installed and compiled a lot of stuff on it. So, being a newbie, I went to the #freebsd channel and asked for help. To my surprise, I was immediately told that the only way for me was to reinstall the entire system. I expressed some doubts about this, since I was pretty sure that my data was still intact on the system, but was told again that the only way was to reinstall and restore from backup if I had one. 
At this point I figured that this was a big usability hole for a modern operating system, but I decided to get a second opinion before destroying my data. Some 10-15 minutes later, some other channel member took pity on me and told me that the reinstall was only suggested because I was on the wrong channel. I was supposed to ask for help on #freebsdhelp. I went to that channel and, while my question was ignored for a while, I kept digging through man pages, mailing lists and other documentation and found my answer. By that time, someone on #freebsdhelp had told me to shut up because I didn't use proper terms for disk allocation units. If I weren't stubborn and didn't have enough prior computer knowledge, at this point I would be reinstalling my system from scratch. Why? Because I asked a question on the wrong channel. Mind you that the "right" channel just plain ignored my question, which, while being better than the previous experience, also didn't help much. I am still going to try FreeBSD. 
Albeit I doubt I will ever ask for help from anybody in FreeBSD community.&lt;br/&gt;&lt;/ul&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6138855974683141265-8659802139194902553?l=blog.mitechki.net' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://blog.mitechki.net/feeds/8659802139194902553/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://blog.mitechki.net/2007/03/freebsd-experiment.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6138855974683141265/posts/default/8659802139194902553'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6138855974683141265/posts/default/8659802139194902553'/><link rel='alternate' type='text/html' href='http://blog.mitechki.net/2007/03/freebsd-experiment.html' title='A FreeBSD experiment'/><author><name>Heavy Battle Wombat</name><uri>http://www.blogger.com/profile/06845772522703289167</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='00436648329331425664'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6138855974683141265.post-7750182289807733408</id><published>2007-03-22T13:51:00.001-06:00</published><updated>2010-05-12T02:21:43.604-06:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='mysql'/><title type='text'>Fun with MySQL query optimizer</title><content type='html'>A few days ago, after a minor bug fix in our site code, suddenly, the load on the database server dropped about 50%. I was obviously interested in what caused such a major improvement and found out a few interesting things. 
To demonstrate, this fascinating phenomenon, lets create a database:&lt;br/&gt;&lt;pre&gt;CREATE DATABASE test;&lt;/pre&gt;&lt;br/&gt;create a table&lt;br/&gt;&lt;pre&gt;CREATE TABLE `table1` (&lt;br/&gt;`f1` int(11) NOT NULL auto_increment,&lt;br/&gt;`f2` char(10) NOT NULL,&lt;br/&gt;PRIMARY KEY  (`f1`),&lt;br/&gt;KEY `an_index` (`f2`)&lt;br/&gt;);&lt;/pre&gt;&lt;br/&gt;and populate this table with some values&lt;br/&gt;&lt;pre&gt;DELIMITER $$&lt;br/&gt;DROP PROCEDURE IF EXISTS `test`.`populate_table1`$$&lt;br/&gt;CREATE PROCEDURE `test`.`populate_table1` (ct INT)&lt;br/&gt;BEGIN&lt;br/&gt;PREPARE q1 FROM 'INSERT INTO table1 (f2) SELECT ?';&lt;br/&gt;SET @x = 0;&lt;br/&gt;REPEAT&lt;br/&gt;EXECUTE q1 USING @x;&lt;br/&gt;SET @x = @x + 1;&lt;br/&gt;UNTIL @x &amp;gt;= ct&lt;br/&gt;END REPEAT;&lt;br/&gt;END$$&lt;br/&gt;DELIMITER ;&lt;br/&gt;CALL populate_table1(100000);&lt;/pre&gt;&lt;br/&gt;And now the evil magic begins (query results skipped for brevity)&lt;br/&gt;&lt;a name='more'&gt;&lt;/a&gt;&lt;br/&gt;&lt;pre&gt; mysql&amp;gt; SELECT COUNT(*) FROM table1&lt;br/&gt;WHERE f2 IN ("100", "101", "102", "103", "104", "105", "106",&lt;br/&gt;"107", "108", "109", "110", "111", "112", "113", "114", "115",&lt;br/&gt;"116", "117", "118", "119", "120", "121", "122", "123", "124",&lt;br/&gt;"125", "126", "127", "128", "129", "130", "131", "132", "133",&lt;br/&gt;"134", "135", "136", "137", "138", "139", "140", "141", "142",&lt;br/&gt;"143", "144", "145", "146", "147", "148", "149", "150", "151",&lt;br/&gt;"152", "153", "154", "155", "156", "157", "158", "159", "160",&lt;br/&gt;"161", "162", "163", "164", "165", "166", "167", "168", "169",&lt;br/&gt;"170", "171", "172", "173", "174", "175", "176", "177", "178",&lt;br/&gt;"179", "180", "181", "182", "183", "184", "185", "186", "187",&lt;br/&gt;"188", "189", "190", "191", "192", "193", "194", "195", "196",&lt;br/&gt;"197", "198", "199", "200");&lt;br/&gt;+----------+&lt;br/&gt;| COUNT(*) |&lt;br/&gt;+----------+&lt;br/&gt;|      
101 |&lt;br/&gt;+----------+&lt;br/&gt;1 row in set (0.00 sec)&lt;br/&gt;mysql&amp;gt; SELECT COUNT(*) FROM table1&lt;br/&gt;WHERE f2 IN ("100", "101", "102", "103", "104", "105", "106",&lt;br/&gt;"107", "108", "109", "110", "111", "112", "113", "114", "115",&lt;br/&gt;"116", "117", "118", "119", "120", "121", "122", "123", "124",&lt;br/&gt;"125", "126", "127", "128", "129", "130", "131", "132", "133",&lt;br/&gt;"134", "135", "136", "137", "138", "139", "140", "141", "142",&lt;br/&gt;"143", "144", "145", "146", "147", "148", "149", "150", "151",&lt;br/&gt;"152", "153", "154", "155", "156", "157", "158", "159", 160,&lt;br/&gt;"161", "162", "163", "164", "165", "166", "167", "168", "169",&lt;br/&gt;"170", "171", "172", "173", "174", "175", "176", "177", "178",&lt;br/&gt;"179", "180", "181", "182", "183", "184", "185", "186", "187",&lt;br/&gt;"188", "189", "190", "191", "192", "193", "194", "195", "196",&lt;br/&gt;"197", "198", "199", "200");&lt;br/&gt;+----------+&lt;br/&gt;| COUNT(*) |&lt;br/&gt;+----------+&lt;br/&gt;|      101 |&lt;br/&gt;+----------+&lt;br/&gt;1 row in set (0.16 sec)&lt;/pre&gt;&lt;br/&gt;Do you see the difference? Probably not. But it just made the query 16 times slower! I spent a long time finding that one (and in our case I was looking at endless lists of zip codes). 
Let me illustrate.&lt;br/&gt;&lt;pre&gt; mysql&amp;gt; SELECT COUNT(*) FROM table1&lt;br/&gt;WHERE f2 IN (100, 101, 102, 103, 104, 105, 106, 107, 108, 109,&lt;br/&gt;110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122,&lt;br/&gt;123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135,&lt;br/&gt;136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148,&lt;br/&gt;149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161,&lt;br/&gt;162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174,&lt;br/&gt;175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187,&lt;br/&gt;188, 189, 190, 191, 192, 193, 194, 195, 196, 197, 198, 199, 200);&lt;br/&gt;+----------+&lt;br/&gt;| COUNT(*) |&lt;br/&gt;+----------+&lt;br/&gt;|      101 |&lt;br/&gt;+----------+&lt;br/&gt;1 row in set (0.16 sec)&lt;/pre&gt;&lt;br/&gt;Right. Type casting. In the second query I unquoted one of the values in the list, namely the value "160". Now to the point. I was a bit frustrated and couldn't believe that a cast of an INT to a CHAR would take so long, and it doesn't. A simple EXPLAIN shows what is wrong: we are not using indexes properly.&lt;br/&gt;&lt;pre&gt; mysql&amp;gt; explain SELECT COUNT(*) FROM table1&lt;br/&gt;WHERE f2 IN (&amp;lt;badly quoted list&amp;gt;)\G&lt;br/&gt;*************************** 1. row ***************************&lt;br/&gt;id: 1&lt;br/&gt;select_type: SIMPLE&lt;br/&gt;table: table1&lt;br/&gt;type: index&lt;br/&gt;possible_keys: an_index&lt;br/&gt;key: an_index&lt;br/&gt;key_len: 10&lt;br/&gt;ref: NULL&lt;br/&gt;rows: 100000&lt;br/&gt;Extra: Using where; Using index&lt;br/&gt;1 row in set (0.00 sec)&lt;br/&gt;mysql&amp;gt; explain SELECT * FROM table1&lt;br/&gt;WHERE f2 IN (&amp;lt;well quoted list&amp;gt;)\G&lt;br/&gt;*************************** 1. 
row ***************************&lt;br/&gt;id: 1&lt;br/&gt;select_type: SIMPLE&lt;br/&gt;table: table1&lt;br/&gt;type: range&lt;br/&gt;possible_keys: an_index&lt;br/&gt;key: an_index&lt;br/&gt;key_len: 10&lt;br/&gt;ref: NULL&lt;br/&gt;rows: 116&lt;br/&gt;Extra: Using where; Using index&lt;br/&gt;1 row in set (0.01 sec)&lt;/pre&gt;&lt;br/&gt;As you can see, in the first case (with a type cast) we look through all the 100K values, whereas in the second case (with quoting fixed) we only look through 116. That is where the performance goes. But why? I asked some people from MySQL AB (good thing we have a support contract). The answer I got was that it is difficult to use indexes on CHAR fields after a cast from INT, since you can compare a CHAR to an INT in more than one way. Wow! A quick check in MySQL confirmed my nightmares.&lt;br/&gt;&lt;pre&gt;&lt;br/&gt;mysql&amp;gt; select 1 = 1;&lt;br/&gt;+-------+&lt;br/&gt;| 1 = 1 |&lt;br/&gt;+-------+&lt;br/&gt;|     1 |&lt;br/&gt;+-------+&lt;br/&gt;1 row in set (0.00 sec)&lt;br/&gt;mysql&amp;gt; select 1 = "1";&lt;br/&gt;+---------+&lt;br/&gt;| 1 = "1" |&lt;br/&gt;+---------+&lt;br/&gt;|       1 |&lt;br/&gt;+---------+&lt;br/&gt;1 row in set (0.00 sec)&lt;br/&gt;mysql&amp;gt; select 1 = "1  ";&lt;br/&gt;+-----------+&lt;br/&gt;| 1 = "1  " |&lt;br/&gt;+-----------+&lt;br/&gt;|         1 |&lt;br/&gt;+-----------+&lt;br/&gt;1 row in set (0.00 sec)&lt;br/&gt;mysql&amp;gt; select 1 = "1asdf";&lt;br/&gt;+-------------+&lt;br/&gt;| 1 = "1asdf" |&lt;br/&gt;+-------------+&lt;br/&gt;|           1 |&lt;br/&gt;+-------------+&lt;br/&gt;1 row in set, 1 warning (0.00 sec)&lt;/pre&gt;&lt;br/&gt;Whaaa!!! So that explains why MySQL finds it "difficult" to use indexes after a type cast. 
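&lt;br/&gt;For what it is worth, here is a minimal sketch of two ways to keep every element of the list a string, so that the comparison stays string-to-string (the MySQL manual's entry for IN() itself warns against mixing quoted and unquoted values). It reuses table1 from above; the short three-value list is only for illustration, and I make no promises about the plan your server will pick, so check it with EXPLAIN.&lt;br/&gt;&lt;pre&gt;-- Option 1: quote every literal in the list&lt;br/&gt;SELECT COUNT(*) FROM table1&lt;br/&gt;WHERE f2 IN ("159", "160", "161");&lt;br/&gt;-- Option 2: if a value arrives as a number, cast it to a string explicitly&lt;br/&gt;SELECT COUNT(*) FROM table1&lt;br/&gt;WHERE f2 IN ("159", CAST(160 AS CHAR), "161");&lt;br/&gt;-- Then compare the type and rows fields with the EXPLAIN output above&lt;br/&gt;EXPLAIN SELECT COUNT(*) FROM table1&lt;br/&gt;WHERE f2 IN ("159", CAST(160 AS CHAR), "161")\G&lt;/pre&gt;&lt;br/&gt;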
The question that remains is what the MySQL team was smoking when they implemented string comparison in this creative new way.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6138855974683141265-7750182289807733408?l=blog.mitechki.net' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://blog.mitechki.net/feeds/7750182289807733408/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://blog.mitechki.net/2007/03/fun-with-mysql-query-optimizer.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6138855974683141265/posts/default/7750182289807733408'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6138855974683141265/posts/default/7750182289807733408'/><link rel='alternate' type='text/html' href='http://blog.mitechki.net/2007/03/fun-with-mysql-query-optimizer.html' title='Fun with MySQL query optimizer'/><author><name>Heavy Battle Wombat</name><uri>http://www.blogger.com/profile/06845772522703289167</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='00436648329331425664'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6138855974683141265.post-7394638316290993732</id><published>2007-03-21T13:40:00.000-06:00</published><updated>2010-05-12T02:21:03.315-06:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='security'/><title type='text'>Web security and the autocomplete attribute</title><content type='html'>In this day and age, most web browsers offer some sort of form autocompletion feature. IE, Firefox, Opera and Safari all have it. The browser will offer to store your logon information and common form fields such as first name or address and fill them into the appropriate forms. 
As far as I am concerned this feature was a blessing for web security. Now, you would think, a user can choose proper, secure passwords for his various web sites without writing them down on sticky notes and without reusing the same ones over and over again.&lt;br/&gt;&lt;br/&gt;Unfortunately this is not entirely true, since not one browser I know offers an easy and obvious way to back up the passwords and an easy and obvious way to use them from a portable medium such as a USB key or a mini CD. I know that with a bit of skill you can export IE and Opera settings and back up a Firefox profile, but who is going to bother? This should be something the browser does for you transparently, for people to actually get into the habit of using such a feature properly. But back to the point. In Microsoft's implementation, autocomplete doesn't automatically fill the fields. If you go to the login page of some site it will present you with a choice of usernames, and once you pick a username it will fill in the password if it was stored. Not very secure. Especially considering that by default every time you fill in a form IE will prompt you to turn autocomplete on, and once it is on, there is no indication that your username is being stored. It will prompt about the password, though. So, on a publicly accessible computer this feature becomes a privacy and security horror?&lt;br/&gt;&lt;br/&gt;Not really. If you do not turn this feature on by default and do not bug the user about it, no one in their right mind will turn it on on a public computer, and a nicely evil restrictive user policy will help against those not in their right mind. Firefox also has a similar feature, where it will ask to remember your username and password, but the default answer is no and it will not remember your username by default without asking. So, what is the Microsoft solution? In IE 5.5, a new feature creeps in. The autocomplete attribute. 
Now anybody who is trying to design a login form can turn off autocomplete for a particular form, or even for a particular control. Firefox, in a fit of moronic excitement, follows suit and implements support for this new non-standard attribute without thinking about the consequences. I am fairly sure Opera recognizes autocomplete as well. So, for example, when browsing the Chase online banking site you are not going to be prompted to remember the password you are entering. So, this is a good thing. You do not want to leave your bank account wide open, do you? No, I do not. But this is not a good thing. Why? Because it takes control over security away from me, the user. Now Chase Manhattan Bank, and not me, decides that my bank login information is too sensitive to store anywhere. A lot of sites follow suit and use the autocomplete attribute left and right without any regard to the actual risks of user accounts falling into the wrong hands. Some web site analyzer programs actually throw out a warning if they find a password field without autocomplete=off. So, now I cannot decide for myself if a particular piece of information is important or not; this has been decided for me.&lt;br/&gt;&lt;br/&gt;Promoting personal freedoms is a good thing, but do these restrictions actually help? Let's imagine a scenario where it helps. A user comes to an internet cafe and goes to check his bank account. When prompted to save his account information on a public computer that doesn't belong to him he inexplicably clicks yes (and mind you that just hitting enter wouldn't help, since even IE doesn't choose to store the password by default) and walks away. The next person to use the same computer happens to be an evil bandit, finds his username and voila, all his money is gone, all his base are belong to us and the bandit is in his base killing his dudes. Well, this is bad, so instead we never offer to remember his password. 
So, to accommodate the fact that he has to remember his passwords by himself once again, the user in our example and a lot of other users revert to the ways I described at the beginning of this post. They use the same password everywhere (or two, or three), they choose simple, easy to remember (and easy to crack) passwords, they leave sticky notes with passwords on the monitor etc. etc. etc. So to protect some schmuck who doesn't know that fire burns and guns kill, MS have inconvenienced a lot of reasonable people into lowering their defenses. Is that still a Good Thing?&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6138855974683141265-7394638316290993732?l=blog.mitechki.net' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://blog.mitechki.net/feeds/7394638316290993732/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://blog.mitechki.net/2007/03/web-security-and-autocomplete-attribute.html#comment-form' title='10 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6138855974683141265/posts/default/7394638316290993732'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6138855974683141265/posts/default/7394638316290993732'/><link rel='alternate' type='text/html' href='http://blog.mitechki.net/2007/03/web-security-and-autocomplete-attribute.html' title='Web security and the autocomplete attribute'/><author><name>Heavy Battle Wombat</name><uri>http://www.blogger.com/profile/06845772522703289167</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' 
value='00436648329331425664'/></author><thr:total>10</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6138855974683141265.post-2972744063094607618</id><published>2007-04-09T11:18:00.000-06:00</published><updated>2010-05-12T02:21:03.313-06:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='security'/><title type='text'>Freedom vs. accountability in system administration</title><content type='html'>One of the standard security measures on a contemporary UNIX system is the sudo command. For those unfamiliar with it, sudo allows a user to run commands with the privileges of another user, so for example a regular user can run a command as root. This, at first glance, seems very similar to su, but sudo allows very fine configuration of exactly which commands may be run by which user and from which host, and sudo, as opposed to su, doesn't require the user to know the root password. Also, sudo will log every use of itself, whether successful or failed, therefore leaving an audit trail of the administration commands used on the system. Sudo is exceptionally good for giving regular users fragments of root power where they need it. For example, using sudo you can give your developers rights to restart the development database server or development web server, or give them rights to use network sniffers etc. One of the other things sudo seems to be good for is recording actions taken by system administrators, for accountability purposes. It all seems very simple&lt;br/&gt;&lt;ul&gt;&lt;li&gt;Create regular users for every administrator&lt;/li&gt;&lt;li&gt;Configure sudo to allow administrators to run any command as root using sudo&lt;/li&gt;&lt;li&gt;Disable the actual root logon&lt;/li&gt;&lt;/ul&gt;And voilà, every time one of the administrators does something that requires root privileges, he is forced to use sudo and his exact command line is logged for potential future audit. Or that would be the idea. 
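&lt;br/&gt;A rough sketch of that setup, assuming a hypothetical group called admins that holds the administrators' regular accounts (edit sudoers with visudo only, and note that the way to lock root and the log locations differ between systems):&lt;br/&gt;&lt;pre&gt;# /etc/sudoers fragment: members of the (hypothetical) admins group&lt;br/&gt;# may run any command as any user&lt;br/&gt;%admins ALL=(ALL) ALL&lt;br/&gt;# shell command, run once from an account that already has working sudo access:&lt;br/&gt;# lock the root password so nobody logs on as root with it&lt;br/&gt;sudo passwd -l root&lt;/pre&gt;&lt;br/&gt;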
Unfortunately there are two things that prevent this from being an administration audit panacea. Namely, &lt;pre&gt;sudo /bin/bash&lt;/pre&gt; and &lt;pre&gt;sudo vim /var/log/secure&lt;/pre&gt;, where the first one will run an interactive root shell (allowing one to start running commands as root directly from the shell without any logging) and the second one starts an editor on the sudo audit log (the log name may be different on different systems), allowing one to delete or edit any audit lines one deems unsightly (for example change your user name to somebody else's in that line that says rm -rf /oracle :) ). What are the ways to prevent this?&lt;br/&gt;&lt;ul&gt;&lt;li&gt;Exclude potentially dangerous commands such as a command shell or an editor without arguments from the sudo config&lt;br/&gt;&lt;/li&gt;&lt;li&gt;Set a strict list of administration commands that administrators are allowed to execute&lt;/li&gt;&lt;li&gt;Use external auditing mechanisms such as the auditd daemon&lt;br/&gt;&lt;/li&gt;&lt;li&gt;Use external privilege restriction mechanisms such as SELinux.&lt;/li&gt;&lt;/ul&gt;The first way is obviously bad. This is a classic example of &lt;a href="http://www.ranum.com/security/computer_security/editorials/dumb/"&gt;"enumerating badness"&lt;/a&gt;, where you try to enumerate everything you want to block instead of enumerating what you want to allow. Also, this approach is just plain impossible to implement, since there are too many ways to run a shell or an editor without triggering the sudo restrictions you might impose. The second way might work somewhat in a big shop where each administrator is given a particular piece of the system to work with, so the web administrator is set up to run web server administration commands and nothing else, the database administrator only has access to database administration etc. Unfortunately this approach also has its faults. 
For one, somebody has to have full access to the system, at least so that the sudo configuration can be changed when staff moves around. Also, in situations such as debugging a difficult to catch problem on the server, an administrator may benefit greatly from access to unusual tools, and such use can be difficult to predict. The third and fourth ways are definitely worth looking at and probably worth implementing, but discussing them is a bit out of scope of this article. I will write another article on administration of SELinux and auditing with auditd some other day. Returning to uses of sudo, the question is where you want to draw the line between the convenience and freedom of action of your system administration staff and having a trustworthy audit trail. In big companies this question has only one answer, and that is "we want to have trusted audit information no matter the cost", while in smaller shops accountability may be less of a concern due to more trusting relationships between the staff, and sudo logs may be enough for basic "who did what to the system" logging.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6138855974683141265-2972744063094607618?l=blog.mitechki.net' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://blog.mitechki.net/feeds/2972744063094607618/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://blog.mitechki.net/2007/04/freedom-vs-accountability-in-system.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6138855974683141265/posts/default/2972744063094607618'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6138855974683141265/posts/default/2972744063094607618'/><link rel='alternate' type='text/html' href='http://blog.mitechki.net/2007/04/freedom-vs-accountability-in-system.html' 
title='Freedom vs. accountability in system administration'/><author><name>Heavy Battle Wombat</name><uri>http://www.blogger.com/profile/06845772522703289167</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='00436648329331425664'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6138855974683141265.post-5019912194109114965</id><published>2007-04-24T12:26:00.000-06:00</published><updated>2010-05-12T02:21:03.312-06:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='rant'/><title type='text'>Blogs are offensive</title><content type='html'>According to the report created by ScanSafe, 80% of all blogs contain "offensive" and/or "unwanted" content. I haven't read the report myself, but according to the &lt;a href="http://arstechnica.com/news.ars/post/20070424-report-80-percent-of-blogs-contain-offensive-content.html"&gt;post about it&lt;/a&gt; at &lt;a href="http://arstechnica.com/index.ars"&gt;Ars Technica&lt;/a&gt;, it is enough for a blog to have one instance of one of the "bad words" to be considered offensive. I suppose this is one of the rare cases where I prefer to stick with majority. 
Fuck, fuck, fuck.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6138855974683141265-5019912194109114965?l=blog.mitechki.net' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://blog.mitechki.net/feeds/5019912194109114965/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://blog.mitechki.net/2007/04/blogs-are-offensive.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6138855974683141265/posts/default/5019912194109114965'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6138855974683141265/posts/default/5019912194109114965'/><link rel='alternate' type='text/html' href='http://blog.mitechki.net/2007/04/blogs-are-offensive.html' title='Blogs are offensive'/><author><name>Heavy Battle Wombat</name><uri>http://www.blogger.com/profile/06845772522703289167</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='00436648329331425664'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6138855974683141265.post-7571078491078309977</id><published>2007-07-31T09:10:00.000-06:00</published><updated>2010-05-12T02:21:03.310-06:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='rant'/><title type='text'>Success of Ubuntu</title><content type='html'>I think that the existence of &lt;a href="http://www.thejemreport.com/mambo/content/view/340"&gt;this blog post&lt;/a&gt; is a clear indication that Linux is succeeding on the Desktop :)&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6138855974683141265-7571078491078309977?l=blog.mitechki.net' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' 
href='http://blog.mitechki.net/feeds/7571078491078309977/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://blog.mitechki.net/2007/07/success-of-ubuntu.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6138855974683141265/posts/default/7571078491078309977'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6138855974683141265/posts/default/7571078491078309977'/><link rel='alternate' type='text/html' href='http://blog.mitechki.net/2007/07/success-of-ubuntu.html' title='Success of Ubuntu'/><author><name>Heavy Battle Wombat</name><uri>http://www.blogger.com/profile/06845772522703289167</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='00436648329331425664'/></author><thr:total>0</thr:total></entry></feed>