GETOPT.ORG

Welcome to GetOpt.Org!

This website is just a no-frills, no-nonsense placeholder to keep public information about various projects I've been involved in over the last couple of years. You shouldn't expect any flashy DHTML here - just a bunch of links, and of course the content. :-)

About me

A few facts about me: I'm 45, a native Pole, born, raised and educated in Warsaw, and I graduated with an M.Sc. in Electronic Engineering from the Warsaw University of Technology. I'm married to a beautiful girl named Beata (as in "beatitude"), and I'm a happy father of two - 11-year-old Peter and 8-year-old Cathy. I'm also a born-again Christian.

Regarding my professional experience, you can read my CV here (PDF file).

I'm currently self-employed, and I provide consulting services in system integration and information retrieval through my company, SIGRAM.

This website is dedicated to various projects and professional interests of Andrzej Bialecki.



Projects


Luke
Lucene Index Browser.
Launch with Java WebStart

License: Apache 2.0
Lucene is a mature, high-performance, Open Source search engine written in Java. It is highly flexible, and scales from hundreds to millions of documents (the tests I know about used 43 million documents, achieving search times under 10 s).
Luke is a handy development and diagnostic tool, which accesses already existing Lucene indexes and allows you to display their contents in several ways:
  • browse by document number, or by term
  • view documents / copy to clipboard
  • retrieve a ranked list of most frequent terms
  • execute a search, and browse the results
  • selectively delete documents from the index
  • and more...
Stempel
An algorithmic stemmer and lemmatizer for the Polish language.
License: Apache 2.0
Stemming and lemmatization are important tools for Information Retrieval. They help to improve recall, especially for highly inflected languages such as those of the Slavic family.
Unfortunately, for Polish there is only one freely available dictionary-based stemmer, by Dawid Weiss, and no open source algorithmic stemmers exist. This project aims to fill this gap with a high-quality open source implementation based on an algorithmic approach. The algorithm and its implementation come from the Egothor project.
This project also provides stemming tables built from high-quality corpora.
The current version of the stemmer achieves ca. 95% stemming accuracy for previously unseen word forms, and ca. 75% lemmatization accuracy. This means that even at this early stage it is very useful.
Murmur Hash

License: Apache 2.0
A Java implementation of a fast hash function, created by Austin Appleby (see this site for more information and discussion of its performance).

This is a very fast hash, with excellent avalanche behavior. It's roughly 10 times faster than FNV1a32 (see below), and roughly 5 times faster than a Java version of Jenkins' hash (available here).
This implementation is a Java port of a C version of MurmurHash 2.0.
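For readers who just want to see the shape of the algorithm, here is a minimal sketch of 32-bit MurmurHash2 in Java. It is written from the published C reference and is not the code distributed on this page; the class name MurmurHash2Sketch and the method signature are made up for illustration, and it assumes the input is read as little-endian 4-byte words.

```java
/** Illustrative sketch of 32-bit MurmurHash2; not the port offered on this page. */
final class MurmurHash2Sketch {
    private static final int M = 0x5bd1e995; // mixing constant from the C reference
    private static final int R = 24;         // shift used when mixing each word

    private MurmurHash2Sketch() {}

    static int hash(byte[] data, int seed) {
        int len = data.length;
        int h = seed ^ len;
        int i = 0;

        // Mix the input four bytes at a time (little-endian word, an assumption).
        while (len - i >= 4) {
            int k = (data[i] & 0xff)
                  | (data[i + 1] & 0xff) << 8
                  | (data[i + 2] & 0xff) << 16
                  | (data[i + 3] & 0xff) << 24;
            k *= M;
            k ^= k >>> R;   // unsigned shift, since Java ints are signed
            k *= M;
            h *= M;
            h ^= k;
            i += 4;
        }

        // Fold in the remaining 1-3 bytes; the cases fall through on purpose.
        switch (len - i) {
            case 3: h ^= (data[i + 2] & 0xff) << 16;
            case 2: h ^= (data[i + 1] & 0xff) << 8;
            case 1: h ^= (data[i] & 0xff);
                    h *= M;
                    break;
            default: break;
        }

        // Final avalanche so the last bytes affect all output bits.
        h ^= h >>> 13;
        h *= M;
        h ^= h >>> 15;
        return h;
    }
}
```

Typical use would be something like MurmurHash2Sketch.hash(url.getBytes(java.nio.charset.StandardCharsets.UTF_8), seed), where the seed is any int you choose and keep fixed.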
FNV1 Hash

License: Apache 2.0
A Java implementation of fast hash functions, originally created by Glenn Fowler and Phong Vo, and improved by Landon Curt Noll.

"FNV1 hashes are designed to be fast while maintaining a low collision rate. The FNV1 speed allows one to quickly hash lots of data while maintaining a reasonable collision rate. The high dispersion of the FNV1 hashes makes them well suited for hashing nearly identical strings such as URLs, hostnames, filenames, text, IP addresses, etc."

This is a straightforward port of the public domain C version, written by Landon Curt Noll (one of the authors), available from his website.
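To give an idea of how small these functions are, here is a minimal sketch of the 32-bit FNV-1 and FNV-1a variants in Java. It is not the port offered on this page; the class and method names are hypothetical, and only the offset basis and prime are taken from the published FNV parameters.

```java
/** Illustrative sketch of 32-bit FNV-1 and FNV-1a; not the port offered on this page. */
final class Fnv32Sketch {
    private static final int OFFSET_BASIS = 0x811c9dc5; // 32-bit FNV offset basis
    private static final int PRIME = 0x01000193;        // 32-bit FNV prime

    private Fnv32Sketch() {}

    /** FNV-1: multiply by the prime first, then XOR in the next byte. */
    static int fnv1(byte[] data) {
        int h = OFFSET_BASIS;
        for (byte b : data) {
            h *= PRIME;        // int overflow wraps, i.e. arithmetic modulo 2^32
            h ^= (b & 0xff);   // mask so the byte is treated as unsigned
        }
        return h;
    }

    /** FNV-1a: XOR in the next byte first, then multiply by the prime. */
    static int fnv1a(byte[] data) {
        int h = OFFSET_BASIS;
        for (byte b : data) {
            h ^= (b & 0xff);
            h *= PRIME;
        }
        return h;
    }
}
```

The two variants differ only in the order of the multiply and the XOR. For the single byte 'a', this sketch gives 0x050c5d7e for FNV-1 and 0xe40c292c for FNV-1a.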
UPDATE: please see this discussion on the poor avalanche behavior of this hash. A better alternative would be Jenkins' hash or MurmurHash.
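"Avalanche" here means that flipping a single input bit should change about half of the output bits. A rough way to measure it is sketched below. This is only an illustration under my own assumptions, not the method used in the discussion mentioned above: the class name, the method, the choice of 32-bit integer inputs and the simple pseudo-random input sequence are all hypothetical.

```java
/** Illustrative avalanche measurement for functions mapping int to int. */
final class AvalancheSketch {
    private AvalancheSketch() {}

    /**
     * Returns the average number of output bits that change when exactly one
     * input bit is flipped, taken over the given number of sample inputs.
     * A well-mixing 32-bit hash should score close to 16.
     */
    static double averageFlippedBits(java.util.function.IntUnaryOperator hash, int samples) {
        long flipped = 0;
        long trials = 0;
        int x = 0x9E3779B9; // arbitrary starting input
        for (int i = 0; i < samples; i++) {
            int base = hash.applyAsInt(x);
            for (int bit = 0; bit < 32; bit++) {
                int changed = hash.applyAsInt(x ^ (1 << bit));
                flipped += Integer.bitCount(base ^ changed);
                trials++;
            }
            x = x * 1664525 + 1013904223; // simple LCG step to pick the next input
        }
        return trials == 0 ? 0.0 : (double) flipped / trials;
    }
}
```

As a sanity check, the identity function scores exactly 1.0 (each flipped input bit changes exactly one output bit), and a constant function scores 0.0. A hash with poor avalanche will land well below 16 for at least some input bits, which a per-bit breakdown would show more clearly than this single average.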
Thinlet
Lightweight Java GUI framework, based on XUL concept.
License: Lesser GPL
Thinlet is a small one-class XUL engine, which allows you to build fairly complex Java GUI applications, and to keep the "View" part in MVC (Model-View-Controller) well separated from your business logic. Thinlet-based applications are characterized by a small memory footprint for the GUI (the Thinlet engine weighs less than 50kB), a pleasant look'n'feel, and a rapid development cycle. Please see my project Luke for a non-trivial demo of an application based on Thinlet, or visit the Thinlet home page to see other demos.

I also successfully developed a few mobile applications based on a J2ME version of Thinlet.
ebXML / ebTWG
UN/CEFACT effort to build a framework enabling enterprises of any kind to conduct business electronically.
Ongoing work (will it ever end...) on standardizing business models and interfaces for e-commerce. I was involved in the discussions on the use of the Resource-Event-Agent (REA) economic modeling framework in ebTWG models. The REA framework provides some of the central concepts there... You can take a look at the REA Ontology page, where you can find a formal ontology in Protege format. You may also want to visit my ebXML Ontology page.
ECIMF
(E-Commerce Integration Meta-Framework)

EU standardization project in the area of e-commerce interoperability
This 18-month project attempted to come up with a high-level meta-modeling framework and a set of guidelines for successful e-commerce integration between different standards. A mirror of the original project website can be found here, although the final deliverables can only be obtained from the CEN / ISSS website.
PicoBSD
One-floppy version of FreeBSD, for embedded applications

License: BSD
I started this project mostly just to see if it was possible to squeeze FreeBSD into 1.44MB - prompted by the famous QNX demo floppy. To make a long story short, it was possible :-) although it involved writing from scratch many of the basic system utilities (init, shell, ps, vm, netstat etc). I even created a version of the floppy with a simple windowing system (called 'W'), which included a graphical WWW browser... It wasn't so straightforward at the time, and I learned a couple of tricks on how to minimize the space demands of FreeBSD while keeping the functionality... Later on I realized that such a minimalistic version would also work quite well in embedded environments, and so the project went that way from then on...

I'm no longer actively involved in the project - the official website is therefore horrendously out of date. However, the project is still active, and the best source for the most up-to-date PicoBSD is the official FreeBSD source tree, under src/release/picobsd. There is also a dedicated mailing list, freebsd-small@freebsd.org, where you can find some help.
FreeBSD
A powerful Unix-derived operating system.
License: BSD
Apart from being involved in the PicoBSD project, I also made some contributions to other parts of FreeBSD. I was one of the strong supporters of using Forth for the new bootloader design, and I implemented a subset of the standard "cons25" terminal emulation in the bootloader. Another significant contribution was the ability to create "sysctl" nodes dynamically (previously they were created statically using linker sets, during kernel linking).

FreeBSD is a very powerful platform for server applications, and I use it frequently in situations where a Unix-like server is required. I like the clean and well thought-out layout of the system and its consistent design. If I had to mention one fundamental aspect of FreeBSD that sets it apart from Linux (besides the business-friendly license, of course :-) it would be the fact that FreeBSD is developed as a complete system, i.e. not only the kernel but also all the standard Unix tools, and that the sources for all parts of the system are distributed as a complete, tested and ready-to-build tree - which makes the maintenance of FreeBSD systems a breeze. If you add to this the famous "ports" system - only now being copied by other OSes - then I think you have a clear winner. But I'm biased, of course... :-)