Jonathan Gorman

So it's been a while since I've posted. I've been busy of late. Recently my dog Veronica tore her ACL. She needs to be kenneled if she's not lying on a pillow with someone nearby holding her leash to make sure she doesn't try something stupid. Like running for a door. Or jumping onto a couch. Not being allowed on the couches may be the most traumatic part of the whole experience for her so far. (Actually, doing her business on a leash is probably the worst for her. The other day she actually went behind some bushes in order to do so...)

As for us, it's surprisingly exhausting sitting next to a dog for most of a night.

I figured somehow that at least it would get me time to do some writing and work on some projects, but so far that's failed to happen. I've managed to start several though ;).

Squarespace 5 to Squarespace 6

One is that I use Squarespace for this site, in part to lend some indirect support to the TWiT network and in part because I don't want to deal with patching any more software than I already do :). Squarespace just recently underwent a major version upgrade.

I've been playing around with their new system as I was intending to migrate this site over, but I can't quite get the hang of it. Images are the biggest issue, which didn't import as nicely as I'd like. (Thumbnails did, but not the originals...) I think at this point I'll keep learning the new system, but I'm delaying the rollover. Hopefully I can go to it before Christmas, but I'm not counting on it at this point.

Halloween

Colleen and I are working on decorations and our costumes, although it's mostly in the planning stages at the moment. She's found this stuff called Glowire that's pretty cool. Essentially a wire that will glow with a neonish light. We'll be going as a cyberpunk couple, cobbling together some wires and spare parts to make jacks. We need to get cracking on assembling the pieces to that. I'll post pics when we get done. It's also leading me to read up more on the history of punk.

Which leads me to...

Reading

I've been reading a weird mix of stuff lately and some of it in fits and spurts. Books near me right now are a mostly read copy of Melville's Pierre, a book on bacon, I Am Legend, The Rough Guide to Punk, George R. R. Martin's The Sandkings, The Brewmaster's Table, a history of underwear, and a few more I can't quite see from here. That doesn't count some of the electronic books I have, including the very cool Oral Literature of Africa, which was successfully unglued.

TV & Movies

Been watching a whole bunch of movies I've been meaning to see for a while. Recently knocked off the list was Cannonball Run. Right now I'm working my way through Battlestar Galactica (the new but not so new anymore show), which is rather addicting.

And finally...

Git Happy

I've started a few projects on github. One, modifying the Barcode Scanner app for Android to read the Library's version of Codabar, was pretty easy as most of the Codabar code was already present. Uncommenting a few lines was all it took to get it working. I have also commented out some of the other formats and added a check digit algorithm. Without the algorithm it would return junk just a little too frequently. The trade-off is that it is now slow and doesn't work well in low light. I might experiment with replacing it with something less taxing, like checking that the barcode is 14 digits and starts with a 2 or a 3. I'm also hoping to make it so it can work side-by-side with the existing app, but I need to do a little more research there.
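A minimal sketch of that cheaper check, in Python rather than the app's own code. The function name is made up for illustration, and the rule (exactly 14 digits, first digit 2 or 3) is only the rough idea described above, not a real check digit algorithm.

```python
import re

# Hypothetical helper: a cheap sanity check, NOT a check digit algorithm.
# Assumption: a valid library barcode is exactly 14 ASCII digits and
# starts with 2 or 3.
_BARCODE_PATTERN = re.compile(r"[23][0-9]{13}")


def looks_like_library_barcode(text: str) -> bool:
    """Return True if text is 14 digits long and starts with 2 or 3."""
    return _BARCODE_PATTERN.fullmatch(text) is not None
```

A check like this is fast but will happily accept a misread that still happens to be 14 digits, so it trades accuracy for speed.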

A few months ago I put up a simple demonstration of using NFC normalization with perl and MARC records, along with some examples of pulling information out of Voyager and putting it into Unicode databases, called voy2unicode. It needs some more examples, but it is a good base for now.
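For anyone who hasn't run into NFC before, here is a small Python illustration of what the normalization does. This is not taken from voy2unicode (which is Perl); the function name is just for this example.

```python
import unicodedata


def to_nfc(text: str) -> str:
    """Return text in Unicode Normalization Form C (composed form)."""
    return unicodedata.normalize("NFC", text)


# "e" followed by COMBINING ACUTE ACCENT (two code points)...
decomposed = "Cafe\u0301"
# ...becomes the single precomposed character U+00E9 after NFC.
composed = to_nfc(decomposed)
```

The two strings look identical on screen, but they compare as different until both are normalized the same way, which matters when loading records into a database.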

Speaking of MARC, I've contributed some records to two MARC code4lib repositories on github: MARCthulhu and MARC Records. MARCthulhu was directly inspired by Simon Spiro as a repository of really horrible records, both for testing and just as examples of how bad it can get. I created MARC Records as it seemed like there were some useful edge cases to test for that might be only moderately mangled records. So far both repositories are pretty slim, but maybe I'll be able to promote them at this year's Code4Lib.

Well, that's about enough for now.

Posted October 11, 2012

Author Jon Gorman

I don't tend to follow a lot of video podcasts, mostly since I don't watch TV as regularly as I listen to podcasts. I can listen to audio while waiting for the bus or walking to lunch, but I'm never sure when I'm going to watch TV. Getting a Roku has helped, but I still find myself more often going on marathons of a TV show rather than watching the tech channels.

Still, I've been watching more and more of Revision3. First started with Ben Heck's show, but now I've been watching Hak5 and also their quick tips called HakTip. (I'll probably try to catch some episodes of Scam School soon as well, but that's a bit of an aside.).

If you're a fan of Security Now over at the TWiT network, I can highly recommend watching some of the Hak5 episodes. Great info on security. I've been watching some of their old episodes on encryption techniques and the inner workings of SSL.

There's also an excellent amount of material on Linux. The episodes are a real welcome change from some of the more boring tech video podcasts consisting of one guy talking into a camera with some screen capture software.

Currently I'm working through the Linux 101 HakTip episodes. Mostly it is material I already know, but they're pretty short and a useful reminder of some things I'd forgotten, with just enough new stuff to keep me watching episodes. If you've always wanted to get better on the command line, they're a great place to start.

Posted July 9, 2012

Author Jon Gorman

Categories work

Tags linux, podcasts, security, video podcasts

Hi all,

I think I'm going to make Fun Photo Friday an irregular feature, although I'll try to make sure it doesn't go longer than a month. I might reconsider if I get more of a buffer built up.

Posted June 21, 2012

Author Jon Gorman

Just a quick note. It seems there's still a lot of folks out there who haven't heard of Ed Summers' excellent little tool, http://jobs.code4lib.org. It's a semi-automated harvesting system that's pulling in a ton of job postings related to libraries. You can also filter out the @code4lib twitter posts that start with Job:, as they're being supplied by this nifty little system. And like many code4lib projects, it's open source and can be found as shortimer at git.

I know Ed was hoping it might even lead to part-time and contract work where someone might set up a short-term need and a programmer with a lot of library knowledge could fill it. Say, adapt Evergreen so it interacts with some digital library software that it doesn't currently work with. I haven't seen many of those yet, but hopefully we'll see those become more common. It would make earning a little extra money consulting far easier.

Posted June 19, 2012

Author Jon Gorman

Categories work

Tags code4lib, jobs

Last weekend Colleen and V competed in the local Obedience/Rally competition. They did very well, earning three ribbons. All V needs to do is to get one more and she has her first title for Obedience!

Sorry for the delay (and the uncropped image, wanted ).

I might be shifting Fun Photo Friday to be every two weeks since I haven't been able to actually get my act together. However, part of my motivation for this was to both keep the blog lively and also encourage me to take more photos and actually organize them.

Posted June 16, 2012

Author Jon Gorman

Categories Fun Photo Friday

Tags Colleen, dtccu, obedience, rally, v-dog

I haven't been too good about snapping pictures, so here's another one of the V-dog. We'll see, I might end up moving this feature to once every two weeks or once a month.

Posted June 8, 2012

Author Jon Gorman

Categories Fun Photo Friday

Tags photo, v-dog

The in-laws got cows! One of these two led us on a merry chase. Luckily we managed to get him back in the field. (I also discovered just about the right moo sound necessary to get a responding call.)

Sorry for the eye glare. I'm sure if I had more time I'd figure out how to fix it. But I wanted to get a photo up.

I'll try to get some galleries up in the next couple weeks of recent trips and the like, no promises though. This is going to be a busy month.

Posted June 1, 2012

Author Jon Gorman

Categories Fun Photo Friday

Tags cows, farm, vacation

Many, many pictures on my phone are of my dog looking up a tree. Her favorite sport is to chase a squirrel up a tree and sit there. And sit there. And sit....

And once in a blue moon she'll get excited enough to let out a yip. Once someone else was panicked she'd hurt herself and we had to explain that's her "I'm going to eat you" squeal/yip.

Posted May 26, 2012

Author Jon Gorman

Categories Fun Photo Friday

Tags dog park, photo, v-dog

Those following me in my various social circles are probably already sick of hearing about this, but unglue.it launched yesterday. Unglue.it, started by Eric Hellman (aka @GlueJar) and other folks w/ connections to Code4Lib, is an effort to release copyrighted books to the world. It works with right-holders in a Kickstarter style to raise enough money to license an ebook under a Creative Commons non-commercial license. It's a way of "front-loading" profits so the author can be compensated for their work, but the world gets access.

They also have a mechanism for adding books to a "wishlist" that will give them an indication of works that people want and what right-holders they should track down.

This is a brilliant way to deal with some of those really important and hard to find out of print books. For example, I've wishlisted a very good biography of Philip K. Dick by Gregg Rickman called To the High Castle. It's a work that I stumbled across that's sought out by a small circle of Philip K. Dick fans.

It's not clear if there's enough demand for another printing, but unglue.it offers a chance that it could be made easily available again while also giving the author further profit he otherwise isn't going to see from this long-since sold out book. I have had the good fortune to read it due to the fact I work at a major library that has access to a ton of books, but I know many a sci-fi fan that doesn't have the resources I do.

I also must admit that I'm interested to see if this model works. I've thought about trying to do a small hobby side business of making value-added public domain works and perhaps doing something similar to unglue.it with near-orphaned copyright works. However, tracking down right-holders has proved troublesome enough that it's remained in my large bucket o' ideas I'd like to do someday. I'm hoping unglue.it takes off enough that it'll create an infrastructure that might make it easier to do projects of this nature.

Here's what I'm pledging to:

Posted May 18, 2012

Author Jon Gorman

Categories work

Tags code4lib, creative commons, ebooks, unglue.it

Digging through some of my older photos and found this picture I took of the moon. Given the timestamp I suspect that Colleen and I were walking the couple of blocks to our local voting place last November.

On another note, I'm going to be playing with my format for these posts. I might only post every two weeks. It turns out most of my archived photos are of my dog looking at things. That might be more exciting to me than others.

I'm also trying to take more pictures, but I'm still not much of a photographer. I'll also try to get some of the photos I took in Salt Lake City. I just don't want any one "theme" to dominate too long.

Posted May 18, 2012

Author Jon Gorman

Categories Fun Photo Friday

Tags moon, morning, outside, photo, sunrise

Brief post. Took a picture after seeing the Mormon Tabernacle Choir.

Posted May 11, 2012

Author Jon Gorman

Categories Fun Photo Friday

Tags church, eluna2012, photo, travel

I'll be at the ELUNA conference next week. I'm hoping to have some Internet access, but I doubt I'll be blogging it as it happens.

Posted May 4, 2012

Author Jon Gorman

My brother's dog Louie watching everywhere my Dad walks.

The love seat Lou is on is pretty small, making Lou look bigger than his actual 20lbs or so.

Posted May 4, 2012

Author Jon Gorman

Categories Fun Photo Friday

Tags Louie, photo

Posted April 27, 2012

Author Jon Gorman

Categories Fun Photo Friday

Tags costumes, Renaissance, Colleen, garb, photo

Recently I posted a recipe for green beans and carrots on a few social networks and my friend Ashley asked if she could post it on her blog, Season of the Vegan. I like cooking, although it feels weird to me to be posting on a vegan blog when I'm not even really vegetarian. I've never minded too much cooking for my friends who are vegans and I have a few recipes I've accumulated over the years. Combine that with trying to cut back on eating meat and hopefully I'll be able to keep up posting there every once in a while. If you, like me, are looking for some new things to do with veggies it's worth checking out.

And who knows, maybe I'll get around to posting some of the non-vegan stuff here.

Posted April 26, 2012

Author Jon Gorman

Tags blogging, cooking, veggies

There's an excellent series of posts over at Robot Librarian by Bill Dueber with some Solr hacking. If you're at all interested in how systems like VuFind and Blacklight are searching our records, it's worth a read. The series inspired me to get off my duff and write about a useful set of tools, YAZ, that not enough people seem to know about.

Anyone dealing with the cataloging side of librarianship will at some point have a pile of records that needs conversion. It might be MARC-8 records that need to be converted to UTF-8, or perhaps a pile of MARCXML records that need to be converted to MARC.

I've seen people try to use MarcEdit or the Perl MARC::Record libraries to solve these problems. MarcEdit is a wonderful tool, but it's difficult to automate. Using the Perl libraries can take a while and there's a risk of bugs, particularly with complicated issues like character sets. Many of these simple tasks can be handled deftly by YAZ.

YAZ is centered around the Z39.50 protocol for searching and retrieving metadata records. The library offers programmers a lot of hooks for working with a Z39.50 server or even setting up their own. However, the yaz packages also offer a set of command-line tools for working with MARC records. (If you're curious about the z39.50 tool, Appendix I in my article Respect My Authority has an example.)

Don't let the fact that these YAZ tools are command-line scare you away. There's two strengths to the command-line we want to take advantage of here:

being very flexible in specifying what files should be modified
very easy to automate

Play along and get some records

The Internet Archive has an entire section devoted to records, Open Library Data. For example, you can go download some MARC records from San Francisco Public Library. I decided to download the file SanFranPL12.out pretty much at random. One word of warning: most of these collections are rather large and so might take some time to download.

The next few sections require you to have a terminal open if you want to follow along. If you don't know what the terminal means, jump down to "Getting to the command-line" at the bottom of this post. You'll also need to follow the yaz install instructions. If you're a linux user, I'd recommend compiling from source or installing the libyaz and yaz package from IndexData. Most linux distributions seem to have an ancient version of the program in their package repositories.

I downloaded the file SanFranPL12.out to ~/blog/yaz_examples and typed cd ~/blog/yaz_examples. (The ~ is a shortcut for home directory in most Linux/Unix systems).

Quickly viewing records

Typing yaz-marcdump SanFranPL12.out | more gives a readable version of the files you can page through by hitting the space bar. You can quit by hitting q or control-c. Yaz-marcdump by default converts marc records into a marc-breaker type format. The | more sends it to the "more" program for paging through the results of the conversion. (Normally I'd use the pager less which has more features, but Windows systems don't usually have less installed).

The results look something like...

02070ccm 2200433Ia 4500

001 ocmocm53093624
003 OCoLC
005 20040301153445.0
008 030926s2003    wiumcz         n    zxx d
020    $a 0634056603 (pbk.)
028 32 $a HL00313227 $b Hal Leonard
040    $a OCO $c OCO $d ORU $d OCoLC $d UtOrBLW
048    $a ka01
049    $a SFRA
050    $a M33.5.L569 $b K49 2003
092    $f SCORE $a 786.4 $b L779a
100 1 $a Lloyd Webber, Andrew, $d 1948-
240 10 $a Musicals. $k Selections; $o arr.
245 10 $a Andrew Lloyd Webber : $b [18 contemporary theatre classics] / $c [arranged by Phillip Keveren].
260    $a Milwaukee, WI : $b Hal Leonard, $c [2003?]

300    $a 64 p. of music ; $c 31 cm.

The first line is the leader and the rest of the lines are parts of the first MARC record in the set of records. (Since this is one file composed of multiple MARC records).

The position 9 in the leader seems blank for all the records I randomly sampled which means that they're encoded in marc-8.
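If you'd rather check that position programmatically, here is a rough Python sketch. It assumes the usual MARC 21 layout (a 24-byte leader at the start of each record, with position 9 being blank for MARC-8 and "a" for Unicode); the function name is made up for this example.

```python
def leader_encoding(record: bytes) -> str:
    """Guess a record's character coding from leader position 9."""
    leader = record[:24]
    if len(leader) < 24:
        raise ValueError("a MARC leader is 24 bytes long")
    flag = leader[9:10]
    if flag == b"a":
        return "utf-8"
    if flag == b" ":
        return "marc-8"
    return "unknown"
```

Keep in mind this only reports what the leader claims. Records in the wild sometimes have the flag set one way and the data encoded another.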

Yaz-marcdump converting from marc-8 to utf-8.

(Quick reminder if you're following along and tried the above, hit q or control-c to exit more)

Converting a file from marc-8 to utf-8 is pretty easy, just type the following:

yaz-marcdump -f marc-8 -t utf-8 -o marc -l 9=97 SanFranPL12.out > SanFranPL12_utf8.mrc

Let's break down the various options

-f marc-8: The input is marc-8.
-t utf-8: The output should be utf-8.
-o marc: The output should be in marc. (Other commonly used options include line-format and MARCXML)
-l 9=97: Leader position 9 should be set to a. (97 is the decimal character code for a in utf-8).

Now try doing yaz-marcdump SanFranPL12_utf8.mrc | more and you'll see that the leader has the character 'a' in position 09. There's also an argument -i where you can supply the input format, but this defaults to marc. The documentation says you can use a character like -l 9=a instead of the decimal character code, but I've never gotten that to work.
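If you don't remember character codes offhand, a couple of lines of Python will give you the number to use. This is just a convenience sketch; leader_option is a made-up helper, not part of yaz.

```python
def leader_option(position: int, char: str) -> str:
    """Build a value for yaz-marcdump's -l option, such as 9=97."""
    if len(char) != 1:
        raise ValueError("expected exactly one character")
    # ord() gives the decimal code of the character.
    return f"{position}={ord(char)}"


print(leader_option(9, "a"))  # prints 9=97
```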

Yaz-marcdump converting from marc to marcxml.

Converting to marcxml is just a matter of changing the output format from -o marc to -o marcxml.

yaz-marcdump -f marc-8 -t utf-8 -o marcxml -l 9=97 SanFranPL12.out > SanFranPL12_utf8.xml

This can be really, really handy as there's many processes that can manipulate MARCXML that can't touch MARC.

Debugging a record

Some systems do not give a very detailed error message when they reject a MARC record. This is where yaz-marcdump's verbose mode can come in useful. I've taken one of the marc records from the SanFranPL12.out file and inserted some characters and didn't adjust the directory in the record, which will cause errors in some systems.

The record is for "Reality Check!", Volume 2 and I added "CodexMonkey was here!" to the beginning of the 245 subfield $a. This causes the information in the leader and directory to be wrong for this record. If you download the bad record and run it through yaz-marcdump bad_record_mod.mrc | more you'll get warnings about separators being in unexpected places. You can download the unmodified record and run yaz-marcdump bad_record_orig.mrc | more and you'll notice you won't get the warnings.

Adding the command option -v produces a verbose output that shows how the file is being parsed by the yaz-marcdump program. This generates a lot of information but can be really useful if you want to understand how programs understand marc records. Let's look at some snippets from yaz-marcdump -v bad_record_mod.mrc | more and yaz-marcdump -v bad_record_orig.mrc | more

(Directory offset 132: Tag 092, length 0018, starting 00232)
(Directory offset 144: Tag 100, length 0019, starting 00250)
(Directory offset 156: Tag 245, length 0097, starting 00269)
(Directory offset 168: Tag 260, length 0037, starting 00366)

This occurs early in the output and is when yaz-marcdump is actually parsing through the directory, the part of a MARC record that describes where each variable field starts and how long it is. Any parser will expect Tag 245 to be 97 bytes long, but I added a bunch more by just typing it in via the vi editor.
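To make the directory a little less mysterious, here is a small Python sketch of how a parser might walk it. It assumes the standard MARC 21 layout: the directory starts right after the 24-byte leader, each entry is 12 characters (3 for the tag, 4 for the length, 5 for the starting position), and a field terminator byte (hex 1E) ends the directory. It is an illustration, not how yaz-marcdump is actually written.

```python
FIELD_TERMINATOR = 0x1E
LEADER_LENGTH = 24
ENTRY_LENGTH = 12


def parse_directory(record: bytes):
    """Return a list of (tag, length, start) tuples from the directory."""
    entries = []
    pos = LEADER_LENGTH
    while pos < len(record) and record[pos] != FIELD_TERMINATOR:
        entry = record[pos:pos + ENTRY_LENGTH].decode("ascii")
        tag = entry[0:3]
        length = int(entry[3:7])   # field length, including its terminator
        start = int(entry[7:12])   # offset from the start of the data
        entries.append((tag, length, start))
        pos += ENTRY_LENGTH
    return entries
```

Fed the four directory entries shown above, it returns ("245", 97, 269) for the title field, the same numbers yaz-marcdump reports.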

Let's first look at the non-modified record when it gets to the 245 tag.

(Tag: 245. Directory offset 156: data-length 97, data-offset 269)
245 10 $a Reality Check! $n Volume 2 / $c by Rikki Simons ; & [illustrations by] Tavisha Wolfgarth-Simons.
(subfield: 61 52 65 61 6C 69 74 79 20 43 68 65 63 6B 21)
(subfield: 6E 56 6F 6C 75 6D 65 20 32 20 2F)
(subfield: 63 62 79 20 52 69 6B 6B 69 20 53 69 6D 6F 6E 73 ..)
(Tag: 260. Directory offset 168: data-length 37, data-offset 366)
260 $a Los Angeles : $b Tokyopop, $c c2003.

It got to the 245 tag and pulled out the 97 characters that comprise the field. You'll notice the parser is breaking the field into the subfields. The hex numbers are the characters in the subfield, including the subfield flag. (615265616C = aReal)
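If you want to decode one of those hex lines yourself, Python can do it in a couple of lines. A minimal sketch (decode_subfield is a made-up name, and it assumes plain ASCII data like this example):

```python
def decode_subfield(hex_bytes: str):
    """Turn a hex dump of a subfield into (subfield code, value)."""
    raw = bytes.fromhex(hex_bytes)   # spaces between bytes are allowed
    text = raw.decode("ascii")
    return text[0], text[1:]


code, value = decode_subfield("61 52 65 61 6C 69 74 79 20 43 68 65 63 6B 21")
# code is "a" and value is "Reality Check!"
```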

Now a look at the one that's been modified:

(Tag: 245. Directory offset 156: data-length 97, data-offset 269)
245 10 $a CodexMonkey was here! Reality Check! $n Volume 2 / $c by Rikki Simons ; & [illustrations by] Tav
(subfield: 61 43 6F 64 65 78 4D 6F 6E 6B 65 79 20 77 61 73 ..)
(subfield: 6E 56 6F 6C 75 6D 65 20 32 20 2F)
(subfield: 63 62 79 20 52 69 6B 6B 69 20 53 69 6D 6F 6E 73 ..)
(No separator at end of field length=97)
(Tag: 260. Directory offset 168: data-length 37, data-offset 366)
260 sh $ Wolfgarth-Simons.

The parser gets to field 245. After all the subfields have been parsed, yaz-marcdump complains that it could not find the separator that should be there after 97 bytes to indicate the field actually ended. This ends up messing up the following 260 and each field after it. In this case the parser can't be sure if the directory is off or the character just happens to be missing.
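The same kind of check can be sketched in Python. This assumes the standard MARC 21 layout: leader positions 12-16 hold the base address where the data starts, each 12-character directory entry gives a tag, length and starting offset, and every field should end with a field terminator (hex 1E). It is a simplified illustration, not a reimplementation of yaz-marcdump.

```python
FIELD_TERMINATOR = 0x1E


def fields_missing_terminator(record: bytes):
    """Return tags whose last byte, per the directory, is not a terminator."""
    base = int(record[12:17])          # base address of data, from the leader
    bad_tags = []
    pos = 24                           # directory starts after the leader
    while pos < len(record) and record[pos] != FIELD_TERMINATOR:
        tag = record[pos:pos + 3].decode("ascii")
        length = int(record[pos + 3:pos + 7])
        start = int(record[pos + 7:pos + 12])
        last = base + start + length - 1
        if last >= len(record) or record[last] != FIELD_TERMINATOR:
            bad_tags.append(tag)
        pos += 12
    return bad_tags
```

On a record where extra text was typed into one field without fixing the directory, this flags that field and usually every field after it, since all the later offsets are now off too.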

Splitting a MARC file into several MARC files.

The yaz-marcdump program also has some tools that can make dealing with MARC records easier. I ran into a case recently where a process couldn't handle the very large XML file produced by converting a whole file of MARC records into one giant XML file.

Thankfully, the yaz-marcdump tool provides the ability to split an input file into several output files, also called chunking by software geek types. Unfortunately it only seems able to do this with an input type of marc. So let's say I decided I wanted to split the original file into more manageable sized files where each one has only 10,000 records and convert those to xml.
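For the curious, chunking a MARC file is not complicated in principle, since the first five characters of each record's leader give that record's total length. Here is a rough Python sketch of the idea. It is not how yaz-marcdump does it, the function names are made up, and it trusts the lengths in the leaders completely.

```python
from typing import Iterator, List


def iter_records(data: bytes) -> Iterator[bytes]:
    """Yield each record, using the length stored in leader bytes 0-4."""
    pos = 0
    while pos + 5 <= len(data):
        length = int(data[pos:pos + 5])
        if length <= 0:
            raise ValueError(f"bad record length at byte {pos}")
        yield data[pos:pos + length]
        pos += length


def chunk_records(data: bytes, per_chunk: int) -> List[bytes]:
    """Group records into chunks holding at most per_chunk records each."""
    chunks, current = [], []
    for record in iter_records(data):
        current.append(record)
        if len(current) == per_chunk:
            chunks.append(b"".join(current))
            current = []
    if current:
        chunks.append(b"".join(current))
    return chunks
```

Each element of the returned list could then be written out to its own file.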

Splitting is easy, but doing some of the other steps requires some advanced command-line foo that does not work on Windows. I'll need to do this in a couple of steps. ($ is the prompt, don't type it. Just using it to make clear where new commands start).

$ mkdir split_files
$ cd split_files
$ yaz-marcdump -s sfpl -C 10000 ../SanFranPL12.out > /dev/null
$ find . -name 'sfpl*' | xargs -n 1 -I{} sh -c 'yaz-marcdump -f marc-8 -t utf-8 -o marcxml -l 9=97 {} > {}.xml'
$ mkdir ../xml
$ mv *xml ../xml
$ cd ..

Now if you do ls -1 xml/* you should see something like...

sfpl0000000.xml
sfpl0000001.xml
sfpl0000002.xml
sfpl0000003.xml
sfpl0000004.xml

Let's break down the command yaz-marcdump -s sfpl -C 10000 ../SanFranPL12.out > /dev/null

-s sfpl: This tells yaz-marcdump what to prefix to each generated file as well as to split the files
-C 10000: This is the number of records per file. It defaults to one. Also notice that it is an upper-case C, not c. Case matters.
../SanFranPL12.out: Since we're down in the split_files directory, we need to tell the tool that the SanFranPL12.out is located in the parent directory
> /dev/null: For some reason, this program will still output the records to the terminal, even though it's also writing to the files. This redirects the output to /dev/null, essentially a file that never retains any data. You can also use the command-line option -n to suppress the record output, but then you'll still get some output as yaz tries to correct issues it sees with various records.

The really complicated line after that, find . -name 'sfpl*' | xargs -n 1 -I{} sh -c 'yaz-marcdump -f marc-8 -t utf-8 -o marcxml -l 9=97 {} > {}.xml', finds all the files with the prefix and gives that to a program called xargs, which calls the yaz-marcdump command to do a conversion for each file. It's the same as doing...

yaz-marcdump -f marc-8 -t utf-8 -o marcxml -l 9=97 sfpl0000000 > sfpl0000000.xml
yaz-marcdump -f marc-8 -t utf-8 -o marcxml -l 9=97 sfpl0000001 > sfpl0000001.xml
yaz-marcdump -f marc-8 -t utf-8 -o marcxml -l 9=97 sfpl0000002 > sfpl0000002.xml
yaz-marcdump -f marc-8 -t utf-8 -o marcxml -l 9=97 sfpl0000003 > sfpl0000003.xml
yaz-marcdump -f marc-8 -t utf-8 -o marcxml -l 9=97 sfpl0000004 > sfpl0000004.xml

If the designers of yaz-marcdump had included an option for an output file name, the line would have been a bit less ugly.
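If the find and xargs line is more than you want to deal with (or you're on Windows), the same loop can be sketched in Python. This assumes yaz-marcdump is on your PATH and that the split files use the sfpl prefix from above; the function names are made up for this example.

```python
import subprocess
from pathlib import Path


def conversion_plan(directory, prefix="sfpl"):
    """Return (command, output path) pairs, one per split file."""
    plan = []
    for path in sorted(Path(directory).glob(prefix + "*")):
        if path.suffix == ".xml":
            continue  # skip output left over from an earlier run
        command = ["yaz-marcdump", "-f", "marc-8", "-t", "utf-8",
                   "-o", "marcxml", "-l", "9=97", str(path)]
        plan.append((command, path.with_name(path.name + ".xml")))
    return plan


def run_plan(plan):
    """Run each command, sending its standard output to the .xml file."""
    for command, out_path in plan:
        with open(out_path, "wb") as out:
            subprocess.run(command, stdout=out, check=True)
```

Calling run_plan(conversion_plan("split_files")) would do the conversion. Building the plan separately makes it easy to print the commands first and look them over before running anything.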

Getting to the command-line

I feel a little silly writing this section, but when teaching/training people in the past I've had some people really confused on how to get to the command-line. If you're running Mac OS X, you want to launch the Terminal application, which at least used to be in Utilities. In Windows, go to Start -> Run and type cmd. Both of these will launch a terminal window that you can type in.


Posted April 22, 2012

Author Jon Gorman

Categories work

Tags library, marc, yaz, yaz-marcdump

I've got a post coming soon on some basics of using the yaz-marcdump tool. Meanwhile to tide you over I have not just one, but two photos. Last week's photo reminded me of some photos I took at Allerton last May when we went there for the first time.

It was pretty muddy in the hiking trails, so we ended up first going to see the Foo Dogs. In the fenced in part with ivy V suddenly got obsessed with one spot. I figured I'd take a picture of her sniffing, when all of a sudden there was some movement. I managed to snap a picture quickly...

After Colleen and I realized it was quite a young fawn (which was quite hard to see from human height), we got the V-dog away as we didn't want to spook the mother.

And after all that excitement, V got to try out to be a Foo Dog. Didn't have the heart to tell her that Foo Dogs are actually lions. At least the one she posed next to seemed to be impressed by her.

Posted April 19, 2012

Author Jon Gorman

Tags Allerton, deer, foo dogs, photos, v-dog

Yay for hiking trips! Colleen, V and I went to Allerton last Sunday so I could unwind from a morning of doing updates.

Posted April 13, 2012

Author Jon Gorman

Tags Allerton, Colleen, photo, v-dog, veronica

Welcome to Fun Photo Friday! Recently I was looking through some of my photos and realized it would make a good regular feature to make sure this blog never gets too serious.

Today's photo was taken a week ago as I was waiting at home for a quote for a water heater repair or replacement. The V dog spent most of the morning sleeping on the floor. It was so exhausting she had to take a nap on the couch afterwards.

Posted April 6, 2012

Author Jon Gorman

Categories Fun Photo Friday

Tags napping, photo, v-dog, veronica

Recently a newcomer to the Code4Lib mailing list, Cliff, posted a question asking for information about sharing code and also possible ethical considerations as some of the shared code might be based off of others' efforts.

I did a short response that focused more on the first part of his query covering some thoughts about code sharing in the Code4Lib community, which I'm cleaning up and posting here.

There seems to have been a push over the past few years in Code4Lib to share more and more code, even with small projects. There are a lot of individuals scattered about in the library world writing code to accomplish similar tasks, small and large. One common example is the glue between certain academic enterprise systems and our catalogs. This code, particularly in the past, got developed in little pockets without ever getting shared. Occasionally code sharing flourishes as a gated community surrounding a particular vendor, but I think these communities suffer by just not being large enough. There seems to be a conscious push against the tendency of isolated development by releasing often and without regard to size. GitHub in particular has made it really easy and painless to share smaller chunks of code and offer patches to projects.

I have been bad about releasing and sharing source myself. This has been a hindrance as I find myself creating similar code in different internal projects instead of taking a step back and generalizing the code. If I did, not only could the code be shared among my projects, it could be shared with the community.

There is also a barrier in our lawyers. I have not put in the energy needed to get the attention of the office that makes decisions on whether or not to release code as open source. That office also does not make it easy or comfortable to ask questions. I suspect from what I've heard that one really needs to call or try to visit in person, something I tend to sub-consciously avoid in my typical approaches to communication.

On a community level, it feels like Code4Lib is starting to see tension about releasing small projects and lots of code that manifests in a variety of ways.

There is the perception that projects have been abandoned or just don't have the level of support and community necessary to sustain development.
Large scale adoption of code/projects by people who don't have the technical skills to contribute patches and need help to use the project.
Competition among projects that share goals and need to compete with each other for community. I think choices are good, but choices introduce tension and too many choices can lead to people choosing nothing. I don't think the library software world has hit that point, but I can see a future not too far away where this is more of a problem.

There have been a couple of articles over the years on these topics in the code4lib journal that are worth reading, as they describe it in more detail than the general approach I've taken here.

First, an argument by Dale Askey on why to just put stuff out there and why we so often seem to fail to: COLUMN: We Love Open Source Software. No, You Can’t Have Our Code

On the other hand, see Terry Reese's excellent article in the latest issue presenting an argument why one should be prepared to support the code published: Purposeful Development: Being Ready When Your Project Moves From ‘Hobby’ to Mission Critical

Finally, Michael Doran gave an excellent talk a few years back that really stuck in my head with the very issue I've been reluctant to put more effort into: lawyers and code: The Intellectual Property Disclosure: Open Source in Academia. (Powerpoint slides)

In re-reading the original post, I realized I glossed over the ethical part, which is a shame. There are some fascinating issues concerning the ethical dimension of sharing code that was based on and inspired by other code. Of course, on one level are the legal issues involved with copyright and derivative works depending on exactly what "based on" entails.

However, I'm more interested in the learning and sharing aspect of code development. It is extremely useful for me to read code developed by others. Like critical reading of prose, you can learn a lot by not just trying to figure out what the code does, but thinking about how the code you are reading communicates to the reader. Does it flow? Does it jump around? Are abstractions employed that make it easier to conceptualize? It's a fascinating topic and really deserves longer treatment with another post.

My thanks go out to Peter Murray (aka @DataG) who shared a link to my email. Also thanks to Becky Yoose (aka @yo_bj) for retweeting. In doing so they made me realize perhaps it would be worth revising and posting the email as a blog post.

Posted March 20, 2012

Author Jon Gorman

Categories work

Tags code4lib, open source, sharing code

Git Happy and other going ons

Squarespace 5 to Squarespace 6

Halloween

Reading

TV & Movies

Git Happy

Hak5 & HakTip

Fun Photo Friday: Not Every Friday

Code4Lib Jobs

Fun Photo Friday: Obedience and Rally Competition

Fun Photo Friday: V being cute

Fun Photo Friday: Cows!

Fun Photo Friday: V's Hope

Unglue.it launched!

Fun Photo Friday: Outside in Morning

Fun Photo Friday: Temple at Temple Square

ELUNA-bound!

Fun Photo Friday: Lou Dog

Fun Photo Friday: Pirate Queen Colleen

Guest blogging over at Season of the Vegan.

Yaz-marcdump: Simple but powerful MARC batch tool.

Play along and get some records

Quickly viewing records

Yaz-marcdump converting from marc-8 to utf-8.

Yaz-marcdump converting from marc to marcxml.

Debugging a record

Splitting a MARC file into several MARC files.

Getting to the command-line

Fun Photo Friday: Allerton, Episode I.

Fun Photo Friday: Allerton

Fun Photo Friday: Veronica napping after hard work napping.