Thursday, April 29, 2010
Resuming posting
This blog has been on hiatus for a while, mostly because I was busy or lazy or both. Now I will try to resume occasional posting. I think I will start with some posts on switching from VIM to Emacs (as if that has never been blogged before) and on setting up and using Clojure (same for this). And then I will see where that takes me.
Thursday, July 31, 2008
crontab to english translator
A couple of years ago I wrote this script, which takes crontab entries from standard input, parses them and prints an English translation. It is definitely not perfect and will bail on a lot of valid crontab entries, but for what it is worth, here it is.
#!/usr/bin/python
import re
import sys


def ordinal(n):
    """Return n with its English ordinal suffix, e.g. 1st, 2nd, 11th."""
    if 10 <= n % 100 <= 20:
        suffix = "th"
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")
    return "%d%s" % (n, suffix)


class CronJob:
    """A class describing a scheduled job."""

    # Inclusive value bounds for each schedule field (day of week accepts 0-7).
    FIELDS = {"min": (0, 59), "hr": (0, 23), "dom": (1, 31), "mon": (1, 12), "dow": (0, 7)}

    def __init__(self, line):
        """
        Generate a new object from a crontab line. We should differentiate between the following types of crontabs:
        1. something = something (raise exception)
        2. (classic cron schedule)
        3. [!&]word(arg)[,word(arg)...] (fcron style schedule)
        4. #somestuff (comment, raise exception)
        5. (empty line, raise exception)
        """
        line = line.strip()
        if not line:
            raise NotACronJobError("EMPTY")
        m = re.search(r"^#(.*)", line)
        if m:
            raise NotACronJobError("COMMENT", m.group(1))
        m = re.search(r"^(\S+?)\s*=\s*(.*)", line)
        if m:
            raise NotACronJobError("VARIABLE", m.group(1), m.group(2))
        if re.search(r"^(\*|\d+)", line) or re.search(r"^[!&]\w+", line):
            # An fcron line that is nothing but !option(value) sets options and schedules no job.
            if re.search(r"^!.+?\)$", line):
                raise NotACronJobError("GARBAGE", line)
            self._parseLine(line)
        else:
            raise NotACronJobError("GARBAGE", line)

    def _parseLine(self, line):
        if re.search(r"^[!&]\w+", line):
            # fcron style: skip the leading option token.
            self.type = "fcron"
            m = re.search(r"^\S+\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(.+)", line)
        else:
            self.type = "vixie"
            m = re.search(r"^(\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(.+)", line)
        if m is None:
            # Fewer than five schedule fields plus a command.
            raise NotACronJobError("GARBAGE", line)
        self.min = self._parseDateTime(m.group(1), "min")
        self.hr = self._parseDateTime(m.group(2), "hr")
        self.dom = self._parseDateTime(m.group(3), "dom")
        self.mon = self._parseDateTime(m.group(4), "mon")
        self.dow = self._parseDateTime(m.group(5), "dow")
        self.cmd = self._parseCmd(m.group(6))

    def _parseDateTime(self, dt, field):
        """Return the list of values a field matches, or None for "*"."""
        values = self._expand(dt, field)
        lo, hi = self.FIELDS[field]
        if values is not None and any(x < lo or x > hi for x in values):
            raise NotACronJobError("GARBAGE", dt)
        return values

    def _expand(self, dt, field):
        lo, hi = self.FIELDS[field]
        if dt == "*":
            return None
        if re.search(r"^\d+$", dt):
            return [int(dt)]
        if "," in dt:
            res = []
            for part in dt.split(","):
                values = self._expand(part, field)
                if values is None:
                    # "*" inside a list matches every value of the field.
                    values = range(lo, hi + 1)
                res.extend(values)
            return res
        m = re.search(r"^(\*|\d+-\d+)/([1-9]\d*)$", dt)
        if m:
            if m.group(1) != "*":
                lo, hi = [int(x) for x in m.group(1).split("-")]
            # Cron ranges are inclusive, hence hi + 1.
            return list(range(lo, hi + 1, int(m.group(2))))
        m = re.search(r"^(\d+)-(\d+)$", dt)
        if m:
            return list(range(int(m.group(1)), int(m.group(2)) + 1))
        raise NotACronJobError("GARBAGE", dt)

    def _parseCmd(self, cmd):
        # Drop the user field found in system crontabs (only "root" is recognized).
        return re.sub(r"^\s*root\s+", "", cmd)

    def __str__(self):
        s = "Run %s" % self.cmd
        if self.mon is not None:
            months = ("Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")
            # Months are numbered from 1, the tuple from 0.
            s = s + " in " + ",".join([months[x - 1] for x in self.mon])
        if self.dom is not None:
            s = s + " on " + ",".join([ordinal(x) for x in self.dom]) + " day"
            if self.mon is None:
                s = s + " of every month"
        if self.dow is not None:
            week = ("sunday", "monday", "tuesday", "wednesday", "thursday", "friday", "saturday")
            # Both 0 and 7 mean Sunday.
            s = s + " on " + ",".join([week[x % 7] for x in self.dow])
        minutes = None
        if self.min is not None:
            minutes = ",".join([str(x) for x in self.min])
        if self.hr is not None:
            hours = ",".join([str(x) for x in self.hr])
            if minutes is None:
                s = s + " every minute of hour " + hours
            elif len(self.hr) == 1 and len(self.min) == 1:
                s = s + " at %02d:%02d" % (self.hr[0], self.min[0])
            else:
                s = s + " at %s minutes of hour %s" % (minutes, hours)
            if self.dow is None and self.dom is None:
                s = s + " every day"
        elif minutes is not None:
            s = s + " at %s minutes of every hour" % minutes
        else:
            s = s + " every minute"
        return s


class NotACronJobError(Exception):
    """An exception raised by CronJob to indicate that the line in question doesn't contain valid cron schedule information."""

    def __str__(self):
        if self.args[0] == "EMPTY":
            return "Empty Line"
        elif self.args[0] == "COMMENT":
            return "A comment: %s" % self.args[1]
        elif self.args[0] == "VARIABLE":
            return "An environment variable: %s = %s" % (self.args[1], self.args[2])
        elif self.args[0] == "GARBAGE":
            return "Uncronish thingamabob: %s" % self.args[1]
        else:
            return "If you don't know how to play with me, go to the other sandbox!"


if __name__ == "__main__":
    for line in sys.stdin:
        try:
            print(CronJob(line))
        except NotACronJobError as err:
            print(err)
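To make the field syntax concrete, here is a minimal standalone sketch (not part of the script above) of how a single crontab field such as */15 or 1-5 expands into the values it stands for. The function name expand_field and its lo and hi parameters are made up for this illustration, and it handles only numeric fields with lists, inclusive ranges and steps.

```python
def expand_field(field, lo, hi):
    """Expand one numeric crontab field into a sorted list of matching values.

    lo and hi are the inclusive bounds of the field, e.g. 0 and 59 for minutes.
    (Hypothetical helper for illustration; names like "jan" are not handled.)
    """
    values = set()
    for part in field.split(","):
        # Split off an optional "/step" suffix; the default step is 1.
        base, _, step = part.partition("/")
        step = int(step) if step else 1
        if base == "*":
            start, end = lo, hi
        elif "-" in base:
            first, last = base.split("-")
            start, end = int(first), int(last)
        else:
            start = end = int(base)
        # Cron ranges include both ends, hence end + 1.
        values.update(range(start, end + 1, step))
    return sorted(values)
```

Note that the upper bound of a range is included, which is easy to get wrong with Python's half-open range().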
Friday, May 16, 2008
Restoring MySQL databases CLI trick
It is very easy to dump and restore a database using the mysql and mysqldump CLI utilities, just
# backup
mysqldump --single-transaction mydb > dump.sql
# restore
mysql mydb < dump.sql
and you are all set. Unfortunately, if your database is several gigabytes and takes a long time to restore, you might want some sort of output to indicate where in the process your backup or restore is. For a backup you just add the -v flag to your mysqldump command and it will print some information about which table it is backing up. What about restore? While it is definitely possible to just go and check which table is being restored (mysqldump dumps tables in alphabetical order), I came up with a clever little trick to make the restore progress obvious and similar to mysqldump's. Just add perl.
cat dump.sql | perl -ne '/Table structure for table \`(.*?)\`/ && do {chomp($t=`date`); print STDERR $t . " loading $1\n";}; print' | mysql mydb
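For readers who would rather not use perl, the same filter can be sketched in Python. This is only an illustration of the idea: the function name relay is made up, and it works on bytes on the assumption that a dump may contain binary data.

```python
import re
import time

# mysqldump writes a comment of this form before each table definition.
TABLE_RE = re.compile(rb"Table structure for table `(.*?)`")


def relay(dump, out, log):
    """Copy dump lines to out unchanged, noting each new table on log.

    dump and out are binary streams, log is a text stream (hypothetical names).
    """
    for line in dump:
        m = TABLE_RE.search(line)
        if m:
            table = m.group(1).decode("utf-8", "replace")
            log.write("%s loading %s\n" % (time.strftime("%c"), table))
        out.write(line)
```

Saved as a script that ends by calling relay(sys.stdin.buffer, sys.stdout.buffer, sys.stderr) after an import sys, it would take the place of the perl one-liner in the pipeline in front of mysql mydb.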
Friday, March 14, 2008
Why I don't like Debian based distributions.
I have been happily using Fedora for a while now, but I keep a close eye on Ubuntu development, since it is my humble opinion that nothing, at the moment, compares to Ubuntu in ease of use, hardware compatibility and general togetherness. I recommend Ubuntu to people who want to try out Linux, I ran Ubuntu myself for a while, I run beta versions of Ubuntu releases and file bugs (well, when I have time). Now I also have an Eee PC laptop running Ubuntu. I like Ubuntu. But I run Fedora as my main OS. The reason for this is that Ubuntu is a Debian derivative and as such drags along all the horrible Debian legacy. I honestly wish Ubuntu had chosen a different base. What is horrible about Debian? Well, this post intends to list a few things that annoy the hell out of me and that IMHO should have been fixed ages ago. Yes, I am aware that I blaspheme.
- Package installation procedure - when a list of packages is being installed or upgraded, the Debian package manager, dpkg, does this in stages. That is, it will first unpack all the packages, then run all the pre-install scripts, then install the files, then run the post-install scripts etc. (I am not trying to be correct about the exact steps here). And while this behavior might seem logical, therein lies a problem. If, for example, one of the packages' post-install scripts fails, dpkg reports a problem and quits, and all the rest of the packages remain unconfigured. True, dpkg will continue where it left off once the user resolves whatever problem is causing the script to fail or removes the offending package, but this is not the point. Let's consider a case where the updated package itself is broken and its script fails because of a syntax error. Once dpkg fails, the system ends up in a rather strange state. All the services that were to be updated were stopped, but weren't started again (since that happens in the post-install). New libraries were unpacked, but ldconfig wasn't run. A new kernel might have been installed, but a new initrd wasn't generated and the boot manager wasn't updated. Basically we have a broken system that needs careful fixing by a specialist who knows what he is doing. And even if you do know what you are doing, your choices are limited. You can fix the script yourself, repackage and reinstall, but that makes your system somewhat inconsistent. Or you can completely remove the package, rerun dpkg to finish the install/upgrade of the other packages in the queue and try to reinstall the old version, but that might not be possible since the other packages might prevent the old version from being installed, so you need to descend into dependency hell and start selectively uninstalling and downgrading packages to get a working version of whatever software.
Yes, some of this is also true of RPMs, but at least when one of the RPM installs fails, all the rest of the packages are either NOT installed or installed COMPLETELY; nothing except possibly the broken package is left half done.
- Package state markings - as I mentioned in the previous paragraph, when dpkg fails to do some of its tasks it can be rerun and will proceed from the point where it stopped (or fail in the same place). This is done by keeping very granular records of package state. APT seems to like marking packages just a little bit too much, and it annoys its user. Let's say I have started an install of a package that needs 100MB of dependencies and suddenly I need to go somewhere. So, I hit CTRL-C, close the laptop and run out. Later I find that my laptop doesn't have a reader of some sort installed, for example FBReader, and I need it right away to read some document. I run apt-get install fbreader, but suddenly the whole 100MB of stuff starts downloading again. Why? Because APT marked all those packages for installation and will install them unless they are unmarked. Honestly, I don't know how to easily unmark packages marked for install/upgrade short of doing dpkg --get-selections, manually editing the resulting list and piping it into dpkg --set-selections. There may be a way to do this using a GUI such as synaptic, but at a glance I couldn't find it. Another example of this "feature" is when you are trying to remove packages. Sometimes you see a package and think "I don't need this, why is it installed", so you dpkg -P it. And suddenly dpkg tells you that the package is actually a dependency of something or other. But although dpkg proudly reported that it did not remove the package in question because of dependency problems, it DID mark the package as "to be removed", so if the dependencies ever change this package might just disappear without any intervention.
- SysV scripts - Debian, like most other Linux distros, uses SysV startup. One feature, though, seems to be specifically designed to annoy the hell out of the user. Every time a service that has a startup script is upgraded, it is automatically set up to start at boot, even if it was manually turned off before. In Fedora I can say chkconfig httpd off and Apache will not start until I say otherwise. On a less sophisticated system I can say something like rm -f /etc/rc?.d/S*httpd to achieve the same result. On Debian I can update-rc.d -f apache remove, but once an upgrade to the apache package is installed it will reinstate itself on its default runlevels and happily start on boot. As far as I know, there is NO way to prevent this. Ridiculous.
- Package management command set - this is not so much a problem as a source of a lot of confusion, and it is not restricted to the package management system. There is just too much legacy in the Debian native commands. The package system provides a very good illustration. In Fedora I generally use two package management commands, yum and rpm. Yum mostly works with remote repositories and handles package installs, upgrades etc. RPM works with locally installed packages and handles installing from a local file, querying the package DB, removing packages, managing package signing keys etc. In Debian it is not as simple. To install from remote repositories I use apt-get. To search remote repositories I use apt-cache. To install from a local file or remove a package I use dpkg. To query the package database I use dpkg-query. To manage keys I use apt-key. Each of these has its own specific subcommands and flags.
- DEFOMA - the Debian Font manager. Basically this is a convoluted something that is supposed to make all the font management automagical. Unfortunately all it seems to do is confuse anyone who tries to figure out what happens to fonts on the system.
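Going back to the package state markings above: the manual editing step between dpkg --get-selections and dpkg --set-selections can be scripted. The sketch below is only an illustration. The function name reinstate is made up, and it assumes the unwanted "to be removed" marks show up in the selections list as the deinstall or purge states.

```python
def reinstate(selections, packages):
    """Return selections text with the given packages set back to 'install'.

    selections is text in the format printed by `dpkg --get-selections`:
    one package per line, a name followed by whitespace and a state such as
    install, hold, deinstall or purge. (Hypothetical helper for illustration.)
    """
    wanted = set(packages)
    result = []
    for line in selections.splitlines():
        fields = line.split()
        if len(fields) == 2 and fields[0] in wanted and fields[1] in ("deinstall", "purge"):
            # Rewrite only the lines for packages we were asked about.
            line = "%s\t%s" % (fields[0], "install")
        result.append(line)
    return "\n".join(result) + "\n"
```

The returned text is meant to be fed to dpkg --set-selections as root. Every other line passes through untouched, so marks you did intend stay in place.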
Tuesday, November 27, 2007
Yet another post about firefox extensions
Previously I wrote about various useful extensions for Firefox. Recently I have tested quite a few extensions that didn't make my "install these first" list, and although I do not think any of these are of the "you are not browsing right if you do not have this" grade, I find some of them rather nice additions to my web experience.
- CustomizeGoogle - is one of the subtle, yet extremely powerful extensions. Once you install it, suddenly your experience with Google search, GMail, Google Calendar and other Google products just becomes nicer. You get Google Suggest keywords while you type, GMail auto-redirects to an encrypted version, you get links to other search engines in your search results, Google Images starts to actually point to images etc.
- SpeedDial - if you ever used Opera, you already know what this is about. Basically, it adds a special location (you can configure it to be your home page) that shows thumbnails of several (nine by default) sites of your choice with handy shortcuts to go straight to these sites. I used to keep a lot of tabs open at all times in my Firefox sessions, in order to have all the reference documentation I need at hand. Now I just assign relevant pages to my Speed Dial and voila, CTRL-1 gives me a tab with the Apache 2.2 manuals, CTRL-2 a tab with the MySQL reference manual etc. Or I can just open a new tab and click on whatever I need right there. Again, I can see people saying that this is just an unneeded addition to bookmarks and the bookmark toolbar. The bookmark toolbar takes screen space. Bookmarks are nice, but since you cannot assign a shortcut to a particular bookmark (as far as I know), Speed Dial actually does speed up getting to your favorite sites, even if just a little bit. As an alternative, one can always use bookmark keywords (one of the more obscure Firefox features). For example, you can bookmark Slashdot.org and assign the keyword slash to it. Then you use CTRL-T to open a new tab, CTRL-L to switch focus to the location bar, type slash and hit enter. This is much faster than browsing the bookmarks menu with a mouse (especially for keyboard oriented people like me), but not as fast or visually friendly as using the Speed Dial extension. After all, with Speed Dial you do not need to remember keywords. Note: obviously some of these arguments are useless for people who use the mouse more than the keyboard. But I would guess that with one of the mouse gesture extensions you should be able to map Speed Dials to gestures.
- Secure Login - this one is even more subtle. If you are using Password Manager to remember your login information, you might sometimes be annoyed that it fills out your login info whether you actually want it or not. Secure Login will change this behavior to a more appropriate one. Every time there is a login form on the page, Secure Login will search the Password Manager for a fitting login/password combo, and if it finds one it will highlight the form fields in yellow, light up an icon in the status bar and may, if configured, even play a sound. It will prevent the Password Manager from filling the info into the form. Pressing a shortcut key or clicking a toolbar button will fill the form and submit it in one motion (or just fill the form if you are so inclined). It can warn you if the form is attempting to submit something to a domain different from the one the page is located on and will show a popup to indicate where the form will be sent.
- Resizable Form Fields - does exactly what its name suggests. It allows you to resize text fields, text areas, combo-boxes and lists. Well... most of the time at least. I have seen a few sites where it doesn't work (probably due to absolute positioning or some other CSS tricks). But where it works it is a nice feature to have.
- TrashMail.net - will add a menu item "Paste a disposable email address" next to Paste. When used it will use trashmail.net site to generate a temporary email address. This is very useful when trying to read an article from some suspicious site that requires registration.
- BugMeNot - will use the bugmenot.com login database to log in to those annoying sites that require you to register in order to read. The New York Times is one of the popular examples. Yes, this is a morally questionable practice, but those compulsive registration dudes are just soooo annoying and I am not a lawyer to be able to properly read their "Privacy Policy" documents :)
- URL Fixer - this is probably the subtlest one. It will quietly fix basic typos in URLs. Ever typed www.google.con or wwww.gmail.com? No more.
- ScrapBook - this is one of the more non-obvious and extremely powerful add-ons. ScrapBook will allow you to properly gather and organize the data you mine on the web and will give you some tools to properly work with the materials. On a more particular note, ScrapBook will allow you to save a page or a fragment of a page completely to your hard drive. It will allow you to organize these fragments and pages into folders (same as you would organize bookmarks). It will allow you to mark up (same as with a highlighter pen) parts of the pages you saved and add notes and annotations. Since ScrapBook will actually save the data locally you will not worry about the data going off line or changing at the original location. This is a beautiful tool to do research on the web.
Wednesday, October 31, 2007
VirtualBox - the VMware alternative
Yesterday I discovered VirtualBox. In short, VirtualBox is yet another virtualization package. It provides more or less the same function as VMware, Xen, Qemu and VirtualPC. At the moment it is happily running a FreeBSD world build as a guest on my Fedora 8 workstation. I cannot say that my testing of this product is complete, but as far as first impressions go, they are fairly favorable. Let's split these impressions into the three usual categories.
The Good:
- Supports the virtualization extensions of modern CPUs
- Seems less I/O intensive than VMware
- Works on FreeBSD
The Bad:
- The GUI is somewhat clunky
- No script to automatically configure the kernel module and network
- In order to activate the kernel module, I had to guess the location of the module source and run make && make install from the CLI.
- In order to activate bridged networking, I had to manually configure ethernet bridging
The Ugly:
- Once the VM crashed without any reason
- Sometimes the FreeBSD guest seems to have some problems with the virtual CPU.
Tuesday, July 31, 2007
MySQL features I would kill for.
It seems that nowadays there is a trend of writing "top 10 features I want software X to have" posts. I have seen at least two such posts about MySQL, here and here. So, since I have been working with MySQL for a while, here is my list:
- File per table backup mode for mysqldump that would work with --single-transaction flag
- Clustering without the NDB in-memory storage
- Ability to turn logs (query, binary, slow queries) on and off without restarting
- Ability to set up log filters (such as logging queries that use a particular table into a separate file, or logging queries that scan more than 10K rows)
- Ability to use bound variables in prepared statements properly (such as use variables in LIMIT or pass table names in the variables)
- Proper implementation of views (proper, as in not involving running a select every time a view is queried)
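Until something like the log filters above exists, the slow query log can at least be filtered after the fact. The sketch below is a rough illustration only: the function name heavy_queries and the 10000 threshold are made up, and it assumes the usual slow log layout where each entry has a "# Query_time: ... Rows_examined: N" comment line followed by the statement text.

```python
import re

ROWS_RE = re.compile(r"Rows_examined:\s*(\d+)")


def heavy_queries(log_lines, threshold=10000):
    """Yield (rows_examined, statement) for entries scanning more than threshold rows.

    log_lines is any iterable of slow query log lines (hypothetical helper).
    """
    rows = None
    statement = []
    for line in log_lines:
        if line.startswith("#"):
            # A comment line after statement text means a new entry begins.
            if statement:
                if rows is not None and rows > threshold:
                    yield rows, " ".join(statement)
                rows, statement = None, []
            m = ROWS_RE.search(line)
            if m:
                rows = int(m.group(1))
        elif line.strip():
            statement.append(line.strip())
    # Flush the last entry in the file.
    if statement and rows is not None and rows > threshold:
        yield rows, " ".join(statement)
```

Helper statements that the server writes into an entry, such as use or SET timestamp lines, end up in the statement text as well. That is acceptable for a quick look, but it is no substitute for the server doing the filtering itself.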
