Archive for the ‘Tech’ Category

Making Amarok the Default Handler of MP3 Players in Hardy Heron

September 4th, 2008

So I’m in the process of moving through the upgrade chain (new gaming rig, move desktop to old gaming rig, move server to old desktop, etc.), which meant installing Ubuntu 8.04 onto my desktop, and all was going well.  At some point I plugged my Sansa Express in to recharge and move some files, and Rhythmbox opened.  I don’t use Rhythmbox to do music, I use Amarok.

Amarok has crossfading, MusicBrainz integration and decent song management, which is just about all I need out of a music player.   But even once I got Amarok installed (sudo apt-get install amarok), Rhythmbox still popped up when I plugged in my Sansa.

This should be simple to fix.  From the file browser, choose Edit -> Preferences and click on the Media tab.   Go to the Media Player drop-down and select Amarok….

You probably noticed that Amarok isn’t actually there.  Epic failure.  With extensive Google-fu and reading a 5 page discussion on the closed Hardy Heron beta forums I found an answer.  I can’t imagine it is the answer imagined by the Ubuntu/Gnome/Nautilus people, but it has the advantage of working.

The file associations in Hardy Heron are stored in /etc/gnome/defaults.list.  I don’t know if there should be a personal override version of this list (it is possible that the personal version of this file lives at ~/.local/share/applications/defaults.list).  If you edit this file as root, or your personal override file, and change the line:

x-content/audio-player=rhythmbox.desktop

to read

x-content/audio-player=kde/amarok.desktop

It should work.  If it doesn’t, change the line to

x-content/audio-player=amarok.desktop

and create a symbolic link

sudo ln -s /usr/share/applications/kde/amarok.desktop /usr/share/applications/amarok.desktop
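If you would rather script the edit than make it by hand, the change above amounts to rewriting one key in a key=value file. Here is a minimal Python sketch of that idea. The function name set_default_handler is my own invention and the sample text is made up for the demo; it works on the file's contents as a string, so you can try it without touching anything under /etc.

```python
def set_default_handler(text, content_type, desktop_file):
    """Return text with the handler for content_type replaced.

    text is the contents of a defaults.list style file (key=value lines).
    If the key is missing, a new line is appended instead.
    """
    prefix = content_type + '='
    lines = text.splitlines()
    found = False
    for i, line in enumerate(lines):
        if line.startswith(prefix):
            lines[i] = prefix + desktop_file
            found = True
    if not found:
        lines.append(prefix + desktop_file)
    return '\n'.join(lines) + '\n'


if __name__ == '__main__':
    # Made-up sample contents, not a copy of a real defaults.list
    sample = ('[Default Applications]\n'
              'x-content/audio-player=rhythmbox.desktop\n')
    print(set_default_handler(sample, 'x-content/audio-player',
                              'kde/amarok.desktop'))
```

Read the real file, pass its text through the function, and write the result back (as root for the system-wide file). Keep a backup copy first.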

Now, go back to where we began: from the file browser choose Edit -> Preferences and click on the Media tab.   Go to the Media Player drop-down, select Amarok…. and everything should work fine.

Tech

Move a File w/ Directory Maintenance

August 26th, 2008

I find myself often wanting to move a file in a directory tree to the same place under a different parent directory.  Usually this is because I have a directory tree that I want to separate into different bins depending on some criteria.  The upshot is that I’ve written python code similar to the sample below a bunch of times, so it seemed time to stick it somewhere I could easily find it.  At work it is safely stuffed in svn, but I don’t really have an svn repository for home.  Hmmm….

Anyway, this just moves a file.  It creates the new directory structure before moving the file and prunes off any empty directories once the file has been moved.  It raises an OSError if there are permissions issues at any point.

So here’s a simple movefile in python.

import errno
import os
import os.path

# Just some junk for testing; flip the flag to move the file back again
if 1:
    newroot = '/misc/tkusterer/test/old/'
    oldroot = '/misc/tkusterer/test/new/'
else:
    oldroot = '/misc/tkusterer/test/old/'
    newroot = '/misc/tkusterer/test/new/'
src = os.path.join(oldroot, '2/3/hello.txt')

def movefile(src, dst):
    # Build the destination directory tree first
    try:
        os.makedirs(os.path.split(dst)[0])
    except OSError as e:
        # Already exists
        if e.errno != errno.EEXIST:
            raise
    os.rename(src, dst)
    # Prune the source directory and any parents left empty by the move
    try:
        os.removedirs(os.path.split(src)[0])
    except OSError as e:
        # Leaf directory not empty
        if e.errno != errno.ENOTEMPTY:
            raise

if __name__=="__main__":
    dst = src.replace(oldroot, newroot)
    movefile(src, dst)
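One fragile spot in the sample is src.replace(oldroot, newroot), which would also rewrite the root string if it happened to show up again deeper in the path. A slightly safer way to work out the destination is to take the path relative to the old root and join it onto the new one. This is only a sketch, and rebase_path is a name I made up for it.

```python
import os
import os.path


def rebase_path(src, oldroot, newroot):
    """Return where src would live if oldroot were swapped for newroot."""
    rel = os.path.relpath(src, oldroot)
    # A relative path that climbs out of oldroot means src is not inside it
    if rel == os.pardir or rel.startswith(os.pardir + os.sep):
        raise ValueError('%s is not under %s' % (src, oldroot))
    return os.path.join(newroot, rel)


if __name__ == '__main__':
    print(rebase_path('/misc/tkusterer/test/old/2/3/hello.txt',
                      '/misc/tkusterer/test/old/',
                      '/misc/tkusterer/test/new/'))
```

The result can be handed straight to movefile as dst.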

Tech

Agents of Privacy

February 20th, 2008

It’s been a few days since the questionably named Protect America Act expired and we haven’t been killed yet. No reason we should be: the same wiretapping continues unabated, but now they have to go to the FISA court, which never learned the word no. But it at least has the appearance of oversight. The court could, if it felt the need, actually require some indication of wrongdoing. It probably already requires something other than a fishing expedition, thus it’s too much of a hardship for the current administration.

One of the reasons the Protect America Act expired was that Congress couldn’t decide whether to give phone companies blanket immunity for their past misdeeds in giving slews of data to the government, which the government was not authorized to ask for. It’s unclear whether they broke any laws, but it seems clear they had a huge failure of moral judgment. I would argue that phone companies have at least a moral if not legal obligation to act as my agents to protect my data. I suspect in a few years most people will believe this, but at present it’s just wacko techies who think about the implications of other entities holding your data.

Running it through from a very basic level: if an entity (used here mostly to mean a company and/or organization) asks for your information and you give it to them, or even if they collect this data in their normal role as service provider, it seems rational that they have incurred a basic obligation to take reasonable precautions to protect that data. At the very least you expect them to secure their systems such that the Russian mob isn’t simply using their servers as data feeds.

Most people have the expectation that an entity they have given data to will attempt to protect it. In a sense they have become our agents. In accepting our information, part of the expectation is the idea that they will act on our behalf to protect it. It’s a subtle point that I don’t pretend exists in law yet, but it is how the relationship is viewed by most people if they think about it. Part of the unspoken contract between your favorite online vendor and yourself is this idea that your data won’t escape into the wild.

I’m not so naive as to think that vendor isn’t selling my data to various third parties in one form or another. As much as I wish it weren’t true, I understand that the use of my data in the entity’s interests is part of the deal. For that matter I encourage the use of my data in aggregate. But we have certain expectations of who the vendor is selling our data to and for what purposes.

For example, I had a retailer (rpgnow.com) sell my email address to a spammer. This violated the understanding I had with them about the acceptable use of my data. As a result I no longer do business with them.

We have already begun to see along the fringes, retailers who are competing on issues of privacy. I suspect this will become more central as the years move on. We will have companies competing on their skill at being good agents for protecting our data.

Which leads us back to the Protect America Act. These companies who collect our data for the purpose of saving money, anticipating demand, allocating capital resources and other competitive advantages, have also incurred the responsibility of acting as our agents. The government isn’t coming to me with a warrant, National Security Letter or other ‘instrument’ so I can’t verify that the ‘I’s are dotted and the ‘T’s are crossed. That falls to my agents.

Those companies now want to be given immunity for failing to be good agents. The government wants to send the message that companies should ignore the law and just do whatever the people from Washington in suits tell them to. Don’t worry, Washington will take care of any fallout. They want these companies to act as agents of Washington rather than agents of their customers.

We cannot let this happen. We need these people to have an interest in making sure they give Washington all the data required by law, but not a bit more. They must remain motivated to make sure that everyone is operating within the constraints of the law and with proper oversight. They are acting as my agents in this regard and they have a moral obligation to protect my rights.

One day the law will reflect that.

Justice, News, Politics, Tech

Lies, Damn Lies, and Referrers

February 6th, 2008

So looking through my log stats I saw a sudden spike in people coming to my page searching for “computers internet blog”. While I have pages that match those words, I can’t figure how my pagerank for that search would result in any number of hits coming to my pages from that search, much less it being my leading search.

About half of the hits on my pages that don’t get automatically filtered out as robots have no referrers. Since I’d be proud if a dozen people in the world had an actual bookmark to this page, I consider 99%+ of those to be bots I can’t identify. Also, all the regular visitors tend to view the same number of pages as they have hits, which means they aren’t downloading the .css file, or the picture of the moon up there, much less the little icons and other included files that go into making a webpage. It is possible they are all reading my page with lynx or some similar text-only browser, but my browser stats don’t support that.

No, I expect half of the traffic that slips through the robot filter is robots. I like to know this, but it doesn’t bother me very much. It’s the referrers that I pay attention to. I get a list of all the pages with links that people clicked on to get to my page. Sometimes I can’t find them or they are hidden behind passwords, but it seems likely they are real people clicking on real links pointing to pages of mine.

The other thing of interest is the search phrases reported by Google and other search engines. I assume those are also real people searching for real things and ending up on one of my pages.

In comes the phrase “computers internet blog”. There is no way six people a day are coming to my page with that search, and I am right. A little Googling indicates the hits are essentially from a comment-posting bot. While researching I came across a remark that a real Google referrer carries much more stuff, which is missing from these Google referrers. In my logs the offending referrer looks like this:

“http://www.google.com/search?q=computers+internet+blog”

An actual Google referrer looks like:

http://www.google.com/search?hl=en&q=computers+internet+blogs&btnG=Google+Search

And can sometimes have much more stuff. I grepped through my logs looking for similar patterns.

grep -E 'www\.google\.com/search\?q=[^&"]+"' access.log

There were more than just the “computers internet blog” searches. I had searches for “nylons”, “golf cart used parts”, “shipper”, etc. In other words, the bots were throwing off the statistics I had trusted as human, and with more than just the one phrase. I considered my options and decided to give them the ax.

Google works fine using the url that the spammer is using, but it won’t generate it itself. A sophisticated user who writes their own Google urls might generate it, but for the moment I’m willing to consider anyone who is using that url format as a robot (mostly because I didn’t see any legitimate use of that construct in my logs, i.e. search terms for which I have pagerank).
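To make the rule concrete, here is a rough Python version of the same test: treat a referrer as suspect when it is a Google search url whose query string holds nothing but q. The function name is mine, and this only illustrates the heuristic described above. It is not a statement about how Google builds its urls.

```python
from urllib.parse import urlparse, parse_qs


def looks_like_fake_google_referrer(referrer):
    """True if referrer is a Google search url with only a q parameter."""
    parts = urlparse(referrer)
    if parts.netloc.lower() != 'www.google.com' or parts.path != '/search':
        return False
    # keep_blank_values so that an empty extra parameter still counts
    params = parse_qs(parts.query, keep_blank_values=True)
    return list(params) == ['q'] and params['q'] != ['']


if __name__ == '__main__':
    for url in ('http://www.google.com/search?q=computers+internet+blog',
                'http://www.google.com/search?hl=en&q=computers+internet'
                '+blogs&btnG=Google+Search'):
        print(looks_like_fake_google_referrer(url), url)
```

Running something like this over the referrer column of the access log is a cheap way to see how many hits a rule would catch before blocking anything.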

I added the following to the directory section of my config file, though it would work just as well in the .htaccess file.

SetEnvIfNoCase Referer "^http://www\.google\.com/search\?q=[^&]+$" spammer

# Bad bot, no cookie!
Order Allow,Deny
Allow from all
Deny from env=spammer

The exact placement will depend on your config file. You want to be very careful with this: if you mess up the regular expression, you may end up blocking people coming from Google, which doesn’t sound like a winning strategy.

If you change the config file you’ll want to reload it:

/etc/init.d/apache2 reload

And don’t forget to test to make sure Google still works and the bad referrers are blocked.

Tech

Wordpress Canonical Names

January 16th, 2008

So the problem is you have wordpress installed and you want to respond to multiple domain names, but you don’t have a complete list of the domain names a user may use. In other words you want wordpress to work with *.domainname.tld. If you do nothing you will get an error like the following.

Warning: main(/etc/wordpress/config-hostname.domainname.tld.php) [function.main]: failed to open stream: No such file or directory in /etc/wordpress/wp-config.php on line 6

Fatal error: main() [function.require]: Failed opening required '/etc/wordpress/config-hostname.domainname.tld.php' (include_path='.:/usr/share/php:/usr/local/share/php/php-openid-1.2.3/') in /etc/wordpress/wp-config.php on line

One way to get this working is to edit wp-config.php, which should be in the root of your wordpress directory. There is a line:

require_once('/etc/wordpress/config-'.strtolower($_SERVER['HTTP_HOST']).'.php');

This loads a config file based on the url the client entered, which is fine if you’ve been absolutely consistent in using or not using a hostname when talking about your domain (and no one has added or removed one over the years without asking). Assuming that isn’t true, you would like to accept www.domainname.tld, domainname.tld, actualhostname.domainname.tld, etc. You can create a bunch of symbolic links, or if you only have one website on the server you can simply change the above line to:

require_once('/etc/wordpress/config-www.domainname.tld.php');

That should use the same config file regardless of what url the user used.
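If you do have more than one site on the server and go the symbolic link route instead, the links are easy enough to script. Below is a small Python sketch under a couple of assumptions: the config files follow the config-hostname.php naming shown in the error message above, and you can list the aliases you care about (which a true *.domainname.tld wildcard won't let you do). The function name link_host_configs is made up.

```python
import os


def link_host_configs(config_dir, canonical_host, aliases):
    """Symlink config-<alias>.php to config-<canonical_host>.php.

    Returns the list of links that were actually created.
    """
    target = 'config-%s.php' % canonical_host
    created = []
    for alias in aliases:
        # wp-config.php lowercases the host, so lowercase the alias too
        link = os.path.join(config_dir, 'config-%s.php' % alias.lower())
        if not os.path.lexists(link):
            # Relative target, so the links survive moving the directory
            os.symlink(target, link)
            created.append(link)
    return created
```

Calling link_host_configs('/etc/wordpress', 'www.domainname.tld', ['domainname.tld', 'actualhostname.domainname.tld']) as root would leave the real config file alone and add one link per alias.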

Tech

EXIF

January 16th, 2008

My mother has a large collection of photos that she likes to keep organized by date and such. The software she uses to do the organization doesn’t have any sort of auto-categorization features. Mostly I’m talking about breaking out photo uploads by date of photo. Something really fancy might do some clustering on the time stamps to break out different events for a cluster of photos in the morning and another for a second cluster in the evening.

All that aside, I’ve been thinking about writing her something, but it would have to be a mom ui, not my preferred command line interface. Which is all a long-winded intro so I can post a link I don’t want to lose track of. It points to a discussion covering the currently available Windows solutions, so I can check if any of them do a good enough job already.

http://lifehacker.com/345277/namexif-batch-renames-digital-photos-by-date
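For my own future reference, the clustering idea mentioned above doesn't need anything clever: sort the photo time stamps and start a new event whenever the gap to the previous photo is too big. Here is a sketch in Python. The two hour threshold is a number I pulled out of the air, cluster_by_gap is a made-up name, and pulling the time stamps out of the EXIF data is left out entirely.

```python
from datetime import datetime, timedelta


def cluster_by_gap(timestamps, max_gap=timedelta(hours=2)):
    """Group datetimes into events; a gap over max_gap starts a new one."""
    clusters = []
    for stamp in sorted(timestamps):
        if clusters and stamp - clusters[-1][-1] <= max_gap:
            clusters[-1].append(stamp)
        else:
            clusters.append([stamp])
    return clusters


if __name__ == '__main__':
    # Invented sample: two shots in the morning, two in the evening
    shots = [datetime(2008, 1, 12, 19, 15), datetime(2008, 1, 12, 9, 5),
             datetime(2008, 1, 12, 9, 40), datetime(2008, 1, 12, 19, 20)]
    for event in cluster_by_gap(shots):
        print(event[0], 'to', event[-1], '-', len(event), 'photos')
```

Each cluster could then become a dated folder, with a suffix when a day has more than one event.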

Tech

zombieFind

January 15th, 2008

I put up the files for zombieFind and documented the process of getting it working. Check it out at:

zombieFind

Tech

Wordpress DB Insertion

January 6th, 2008

In the old layout I had a series of pages which were copies of a column I had written for my dorm’s news some 15 years ago. The most recent iterations of these pages were done in php. All the php really did was control the layout and such but instead of converting them by hand I figured I’d write a python script to do all the heavy lifting. It was fairly successful.

import os, os.path, re, datetime
import MySQLdb

srcDir = '/tmp/working'

db = MySQLdb.connect(host='xxxxx', db='xxxxx', user='xxxxx', passwd='xxxxx')
c = db.cursor()

for root, dir, files in os.walk(srcDir):
  for file in files:
    date=editor=volume=number=None
    fileNum = file.replace('zen', '').replace('.php', '')
    try:
      # Eliminate files which aren't the ones I'm trying to convert
      fileNum = int(fileNum)
    except ValueError:
      continue

    # The Slug
    name = 'zen-%s' % fileNum

    text = open(os.path.join(root, file)).read()

    # Extract the Date (or a string of any sort) from a variable
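    matches = re.search(r'\$Date\s*=\s*["\']([^\'"]+)["\']', text)
    if matches is not None:
      date = matches.group(1)
    else:
      # No date in the file, so fall back to a default
      date = 'September 1, 1992'
    # Parse whichever string we ended up with into a datetime
    date = datetime.datetime.strptime(date, '%B %d, %Y')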
    # Extract a number
    matches = re.search(r'\$Volume\s*=\s*(\d+)', text)
    if matches != None:
      volume= matches.group(1)
    # Snip volume and other extractions

    title = "Sir High Lord Zen - Volume %s, Number %s" % (volume, number)

    # Pull off the header
    text = text.split('WriteNavControls', 1)[1]
    text = text.split('?>', 1)[1]
    # Pull off the footer
    text = text.split('?>')[-1]
    # In the wordpress db \n (linefeeds) mean something so I need to
    # replace them with spaces in the source text.
    text = text.replace('\n', ' ')

    #print text

    sql = """
      insert into wp_posts
      (post_author, post_date, post_date_gmt, post_content,
      post_title, post_status, post_name, post_modified,
      post_modified_gmt, post_type) values
      (2, %s, %s, %s, %s, 'publish', %s, %s,
      %s, 'post')
      """

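    c.execute(sql, (date, date, text, title, name, date, date))

# MySQLdb does not autocommit by default, so commit once at the end
# (this matters for transactional tables such as InnoDB)
db.commit()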

So the only real problem is that no category is created. Opening the created post and saving it will add the post to the ‘Uncategorized’ category. You can always add it to another category at that point instead.

There is probably some sql that could be written to automatically add an entry to the wp_post2cat table for any post that lacks such an entry, but after five minutes I couldn’t think of a simple way of doing it, and since it was only a few dozen posts I just opened and saved them all.

insert into wp_post2cat (post_id, category_id) values (412, 22);

That will add a single post. If you have a large number of posts to add, it would be pretty easy to write a script that generates the sql statements to populate the wp_post2cat table.
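Something along these lines would do for generating the statements. It is only a sketch: post2cat_inserts is a name I made up, and the extra post ids in the demo are invented, so substitute the ids of the posts you actually inserted.

```python
def post2cat_inserts(post_ids, category_id):
    """Return one insert statement per post id for wp_post2cat."""
    template = ('insert into wp_post2cat (post_id, category_id) '
                'values (%d, %d);')
    # int() guards against anything that is not a plain number
    return [template % (int(post_id), int(category_id))
            for post_id in post_ids]


if __name__ == '__main__':
    # 413 and 414 are hypothetical ids, used only for the demo
    for statement in post2cat_inserts([412, 413, 414], 22):
        print(statement)
```

Redirect the output to a file and feed it to the mysql client, or run each statement through the same cursor the conversion script uses.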

Tech

apt-get and gpg

January 4th, 2008

When doing an apt-get update on my debian machine I received an error like this:

W: GPG error: http://www.debian-multimedia.org etch Release: The following signatures couldn't be verified because the public key is not available: NO_PUBKEY 07DC563D1F41B907

As root doing the following fixed it:

# gpg --recv-keys 07DC563D1F41B907
# gpg --export 07DC563D1F41B907 | sudo apt-key add -

Maybe I’ll remember next time.

Tech

Resurrecting jsFind Part 2

December 18th, 2007

Given that I had an index ready to go last night, all I had to do was get the javascript to work. How hard could that be? Javascript is not a language I consider myself proficient in. I usually swear a lot when js is involved.

So I started by ignoring the js. Grabbed a page and hollowed it out to a template and declared it the search results page.

Then I spent a lot of time staring at js. The example seemed to do its best to hide what a developer actually had to do. I eventually found the call I needed to make. You call the method and pass the search terms and a callback method that gets the results objects and outputs them.

The callback method had some issues and so I wrote my own. Then comes the part where I waste a lot of time because it doesn’t look like search results should look. I finally get something that looks about right. Modern search results have a description/context section which jsFind is missing. This is obvious because I removed the position data when I created jsFind’s xml file. But we have results with a title, url and frequency. Good enough for now.

Reworked the search forms on the page to point to the results page. Seems the piece that processes args doesn’t properly process more than one argument, so I worked around that problem by not giving the submit button a name. I’ll fix it properly later. And now I had one page that worked like a charm.

Then I started testing.

So the page has a stylesheet imported by the @import mechanism, which means it was still pointing to the original site. This got me wondering how I could build some js to include search.js for any location in the site hierarchy… I had the same thought about the css file. For the time being I used a known key in the url to find the root. It works, but it isn’t very portable to another project. I’m going to have to think about a long term solution, but with a root in place I used DHTML to dynamically add the js and css files.

The search page was not portable to other directories. Of course paths returned by jsFind still point to hard coded locations. I’m going to have to use the root information to translate the locations to wherever they happen to be if the root moves (i.e. copied to a cdrom).

Then I got tired of that.

I still have to fix the portability issue, and write a script to do the search and replaces on all the html files in the corpus.

I’d like to keep the positional information and re-write mkIndex (in python) to store that information. Once I rewrote search.js to retrieve that information I could make a callback function that retrieved some context at least out of the html files to show results.

Once (if) that is done I’ll probably post that code up somewhere so others can use it if they have a need. Which is why I’m making this post to remind myself what I did when I document it.

Tech