Set up Solr full-text search in 5 minutes

  1. Download the latest version of Solr (6.5.1 at the time of writing)
  2. Unpack it and go into the bin directory
  3. Start it up by executing:
  4. Initialize the configuration by executing:
  5. Index your files by executing:
  6. Open a web browser and start searching
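
The commands for steps 3 to 5 can be sketched roughly as follows. This is a minimal sketch, not necessarily the exact commands used originally: it assumes a core named "files" built from the example/files configuration that ships with Solr, and it is run from the unpacked Solr directory (the parent of bin) so that the example/ path resolves. The function name and the documents folder are placeholders.

```shell
# Sketch of steps 3-5. Run from the unpacked Solr directory (e.g. solr-6.5.1).
# Assumption: core name "files", configuration taken from example/files/conf.
solr_files_quickstart() {
    local docs_dir="$1"   # folder containing the documents to index

    # Step 3: start Solr (default port 8983)
    bin/solr start &&
    # Step 4: create the "files" core from the bundled example configuration
    bin/solr create -c files -d example/files/conf &&
    # Step 5: index everything below docs_dir
    bin/post -c files "$docs_dir"
}

# Usage: solr_files_quickstart "$HOME/Documents"
```

With these defaults the search page for step 6 should be reachable at http://localhost:8983/solr/files/browse.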

At this point the search is already working.

I like to make a few more tweaks:

  1. Open server/solr/files/conf/velocity/head.vm and remove the CSS rule .result-document:hover. This gets rid of the annoying zoom effect when hovering over search results.
  2. Open server/solr/files/conf/velocity/hit.vm and replace

    This adds a link to each result that opens the file directly. As the link points to a local file, you need to install the Firefox extension local_filesystem_links.

Convert EML to PDF

A project on GitHub does exactly that: eml-to-pdf-converter.

The program can be used as follows:
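
As an illustration, converting a whole folder of messages could look like the sketch below. It assumes Java is installed and that the converter was downloaded as a runnable jar; the default jar name emailconverter.jar and the function name are assumptions, so adjust them to the release you actually downloaded.

```shell
# Convert every .eml file in a folder, one converter call per message.
# Assumption: the jar path defaults to ./emailconverter.jar (second argument overrides it).
eml_dir_to_pdf() {
    local dir="$1" jar="${2:-emailconverter.jar}" f

    for f in "$dir"/*.eml; do
        [ -e "$f" ] || continue        # skip when the folder holds no .eml files
        java -jar "$jar" "$f"
    done
}

# Usage: eml_dir_to_pdf ~/mail/export ~/tools/emailconverter.jar
```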

Dealing with PST under Linux

I prefer to deal with Outlook archives (PST files) by extracting the messages into a folder structure and saving each message as an eml file (the format Thunderbird uses). This can be achieved as follows:
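
A minimal sketch of such an extraction with readpst (part of libpst) is shown here. The -m option is assumed because it writes one file per message with extensions and additionally produces .msg files; the function name and paths are placeholders.

```shell
# Extract a PST archive into a folder tree with one file per message.
# -m: separate files with extensions (.eml), plus .msg files
# -o: output directory
extract_pst() {
    local pst="$1" out_dir="$2"

    mkdir -p "$out_dir" &&
    readpst -m -o "$out_dir" "$pst"
}

# Usage: extract_pst archive.pst ~/mail/archive
```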

If the command cannot be found, you might need to install the package libpst first. The command creates msg and eml files with an increasing number as the filename.

Then you can go into the different folders and execute the script. It renames each eml file so that the filename contains the date and subject. When I used it, it sometimes threw the following error:

For that reason I updated the original script so that it shows the name of the file that causes the issue (these were in fact mails without date information) and removes such files. You can find the updated script here:

If you have issues running the script, you might need to install the following packages: perl-File-Slurp, perl-File-Next and perl-DateTime-Format-Flexible
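
To see which of these are actually missing, a small check like the following may help. It assumes the packages provide the Perl modules File::Slurp, File::Next and DateTime::Format::Flexible; the function name is made up for this example.

```shell
# Print the name of every required Perl module that cannot be loaded.
missing_perl_modules() {
    local m

    for m in File::Slurp File::Next DateTime::Format::Flexible; do
        perl -M"$m" -e 1 2>/dev/null || echo "$m"
    done
}

# Usage: missing_perl_modules
# On Fedora the packages can then be installed with:
#   sudo dnf install perl-File-Slurp perl-File-Next perl-DateTime-Format-Flexible
```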


Compare two Excel files

There is a hidden tool if you are running a Professional version of Office. You can find it here:

Depending on the version of Office that you are running, the folder “Office15” can have a different number. The tool lets you select two files and view them side by side, with a nice graphical overview of the differences.

PDF OCR with Fedora 24 and Tesseract

Run the following commands:

Now you can convert a file like this:
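
One possible way to do the conversion with plain Tesseract is sketched below; it is not necessarily the tool chain used originally. The sketch assumes that pdftoppm and pdfunite (both from poppler-utils) are installed next to tesseract, that the text is English, and that 300 dpi is a sensible resolution. The function name is hypothetical.

```shell
# OCR a scanned PDF: rasterize each page, let Tesseract produce a searchable
# PDF per page, then merge the pages again.
ocr_pdf() {
    local in_pdf="$1" out_pdf="$2" work page

    work="$(mktemp -d)" || return 1
    # Render the pages as PNG images (page-1.png, page-2.png, ...)
    pdftoppm -r 300 -png "$in_pdf" "$work/page" || return 1
    for page in "$work"/page*.png; do
        # "pdf" selects Tesseract's PDF output; the result is <basename>.pdf
        tesseract "$page" "${page%.png}" -l eng pdf || return 1
    done
    # Merge the per-page PDFs into the final document
    pdfunite "$work"/page*.pdf "$out_pdf"
    local rc=$?
    rm -rf "$work"
    return "$rc"
}

# Usage: ocr_pdf scan.pdf scan-ocr.pdf
```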

If you don’t install the tesseract-osd package, it will work but the following error message appears:

Mount Amazon S3 on Fedora 24

There is no package that is ready to be installed. You need to download and compile the code yourself. First you need to install some development libraries. Execute the following commands:
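
Assuming the tool in question is s3fs-fuse (the text does not name it), the build could be sketched as follows. The package list and build steps follow the usual s3fs-fuse instructions for Fedora-like systems; treat them as an assumption and compare them with the project's own README.

```shell
# Install the build dependencies, then fetch, build and install s3fs-fuse.
# Assumption: the tool is s3fs-fuse, cloned into the current directory.
build_s3fs() {
    sudo dnf install -y automake fuse fuse-devel gcc-c++ git \
        libcurl-devel libxml2-devel make openssl-devel &&
    git clone https://github.com/s3fs-fuse/s3fs-fuse.git &&
    (
        cd s3fs-fuse &&
        ./autogen.sh &&
        ./configure &&
        make &&
        sudo make install
    )
}

# Usage: build_s3fs
```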

Then you need to create the directory where you want to mount your bucket:

Now you need to prepare your credentials. Both the AwsAccessKeyId and the AwsSecretAccessKey are needed:
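
If the tool is s3fs-fuse (an assumption, as above), the credentials go into a password file containing a single line of the form KEY_ID:SECRET, readable only by its owner. A minimal sketch, with a hypothetical function name and ~/.passwd-s3fs as the default location:

```shell
# Write the credentials file expected by s3fs and restrict its permissions.
write_s3fs_credentials() {
    local key_id="$1" secret="$2" file="${3:-$HOME/.passwd-s3fs}"

    # umask keeps the file private from the moment it is created
    ( umask 077 && printf '%s:%s\n' "$key_id" "$secret" > "$file" ) &&
    chmod 600 "$file"
}

# Usage: write_s3fs_credentials "<AwsAccessKeyId>" "<AwsSecretAccessKey>"
```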

Now you can mount your bucket:
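
Again assuming s3fs-fuse, mounting could look like this sketch. The bucket name, mount point and password file location are placeholders.

```shell
# Mount a bucket on an existing directory using the credentials file.
mount_bucket() {
    local bucket="$1" mount_point="$2" passwd_file="${3:-$HOME/.passwd-s3fs}"

    s3fs "$bucket" "$mount_point" -o passwd_file="$passwd_file"
}

# Usage:   mount_bucket mybucket /mnt/s3
# Unmount: fusermount -u /mnt/s3
```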

Unfortunately, if something goes wrong (for example, wrong credentials), no error message is shown. The folder is just empty. In such a case you can run the command in debug mode to see more clearly what is going on:
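
With s3fs-fuse (assumed here), a debug run could be sketched as below: -f keeps the process in the foreground, dbglevel=info raises the log level and curldbg also prints the HTTP traffic. Older s3fs versions may use different debug options, so check s3fs --help first.

```shell
# Mount in the foreground with verbose logging to see why a mount stays empty.
# Stop it with Ctrl+C.
mount_bucket_debug() {
    local bucket="$1" mount_point="$2" passwd_file="${3:-$HOME/.passwd-s3fs}"

    s3fs "$bucket" "$mount_point" -o passwd_file="$passwd_file" \
        -o dbglevel=info -o curldbg -f
}

# Usage: mount_bucket_debug mybucket /mnt/s3
```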