Bartosz Bierkowski - Low dose cloud

Scripting in Scala

Recently I needed to try out some ideas around web scraping. The app was a regular web scraper that downloads some pages, extracts data and saves it to disk. I wanted to just run it occasionally from the shell to gather output, without thinking too much about how to launch it.

Where I want to get

As always, there is a choice of technology to be made, and bash/python immediately came to mind, as most of the tools are already installed on the system. On the other hand, I needed to run a quick proof of concept that I could later turn into an application. So I decided to revisit what can be used to write Scala the way one writes shell scripts.

First approach

The first solution, which I came across a while ago, is described at https://alvinalexander.com/scala/scala-shell-script-example-exec-syntax. It uses a plain shell mechanism to execute Scala code directly.
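For orientation, here is a minimal sketch of that mechanism. It assumes a Scala 2 installation with the `scala` launcher on the PATH; the `greeting` helper and the file name `hello.sh` are made up for illustration and are not taken from the linked post.

```scala
#!/bin/sh
exec scala "$0" "$@"
!#
// Everything after the "!#" marker is Scala. The shell never reads it,
// because `exec` replaces the shell process with the scala launcher,
// which in turn skips the header between "#!" and "!#".

// Hypothetical helper: builds the greeting from an optional name.
def greeting(name: Option[String]): String =
  "Hello, " + name.getOrElse("world") + "!"

// `args` holds the command-line arguments when the file runs as a Scala 2 script.
println(greeting(args.headOption))
```

After `chmod +x hello.sh`, the file can be started like any other shell script, for example `./hello.sh Scala`.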

I like that this approach lets me play with plain Scala. Thanks to this post I also learned how to execute similar things directly from shell scripts.

In my scenario I also need some libraries to handle HTTP, HTML and files. The shell script approach does not support any dependency management, so it’s not the best solution for this problem.

Last approach

So, I did spend too much time researching possible ways to solve it or hacking around shell scripts. My next candidate for a quick implementation, right before going for a full sbt project, was to finally check out Ammonite Scala Scripts. I had never used Ammonite before. It is famous for its REPL (Read-Eval-Print Loop), but in this project I wanted to try its Scala Scripts.

Ammonite Scala Scripts

For the single line installation instruction, please see the link above with the official docs.

The mandatory first Ammonite script looks like this.
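As a minimal sketch, assuming Ammonite is installed and the file is saved as `hello.sc` (a hypothetical name), such a script can be as small as:

```scala
// hello.sc: top-level statements are allowed, no class or object wrapper needed.
val greeting = "Hello World"
println(greeting)
```

Running `amm hello.sc` compiles the script and then executes it.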

It is already clear that neither a class nor an object is required to write a script. This is a nice setup to start with. The script is compiled and run directly by executing the amm command on it.

And it works! Well, this is a very good experience. So, let’s see how far it can be pushed before switching to sbt.

The @main annotation creates a regular entry point to the application. Thanks to the annotation, the function can have any name. Without it, the code compiles, but the function is never called.

Also, the compilation itself takes a couple of seconds, but afterwards the execution takes only 1 second. Still, the code/execute cycle is super fast compared to other approaches.
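A small sketch of such an entry point follows. The file name `greet.sc` and the names `greetingFor` and `run` are placeholders chosen for illustration.

```scala
// greet.sc: run with `amm greet.sc Scala`

// Plain helper, kept separate so it can be reused or tested.
def greetingFor(name: String): String = s"Hello, $name!"

// @main marks the entry point; the function name itself is arbitrary.
// Ammonite maps command-line arguments onto the function parameters.
@main
def run(name: String): Unit =
  println(greetingFor(name))
```

Keeping the logic in an ordinary function and leaving only the printing in the annotated one makes a later move to a full project easier.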

Now, time to add some dependencies. The syntax is a bit different from SBT's. It is a pity, as all libraries and repositories display it as <group> %% <artifact> % <version>. The conversion is straightforward though: just replace each % with :, so %% becomes ::. No quotes are required either; the coordinates are wrapped in backticks instead.

The $ivy imports below define the dependencies from artifact repositories. Also $file and $cp (classpath) are supported.
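A sketch of what such imports can look like. The version numbers are only examples and should be replaced with whatever is current, and `helpers` stands for a hypothetical `helpers.sc` file next to the script.

```scala
// SBT notation:      "org.jsoup" % "jsoup" % "1.15.3"
// Ammonite notation: the same coordinates joined with ':' inside backticks.
import $ivy.`org.jsoup:jsoup:1.15.3`

// A Scala library uses %% in SBT, which becomes :: here.
import $ivy.`com.lihaoyi::fansi:0.4.0`

// $file pulls in another script relative to this one (hypothetical file):
// import $file.helpers
```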

Here another surprise was waiting for me. The artifacts are resolved in multiple threads. Each download displays a separate progress bar, and the compilation of such an application feels much more dynamic.

Coursier is the library responsible for dependency resolution in Ammonite. I had heard of it before, but never knew how much I missed it. Glad to learn about it, as it can also be used in SBT.

Back to the main topic. Having the dependencies simplifies the whole application. Honestly, just to test Ammonite and the parsing of the pages, the first dependency alone is enough. JSoup can handle it easily, but who does not like lots of progress bars downloading dependencies in parallel?

To simplify things, let me focus on JSoup only. The basic example fetches the page with the list of OpenShift Morsels and extracts all links. Then the links are printed to the screen.

Full code ready to run without unnecessary examples looks like this:
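A minimal sketch of such a script, assuming a recent JSoup version and taking the page address as a command-line argument instead of hard-coding it. The names `extractLinks` and `run` are illustrative.

```scala
import $ivy.`org.jsoup:jsoup:1.15.3`

import org.jsoup.Jsoup
import org.jsoup.nodes.Document
// Java interop: JSoup returns Java collections. JavaConverters is deprecated
// on Scala 2.13 in favour of scala.jdk.CollectionConverters, but still works.
import scala.collection.JavaConverters._

// Collect (link text, absolute URL) for every anchor that carries an href.
def extractLinks(doc: Document): Seq[(String, String)] =
  doc.select("a[href]").asScala.toSeq.map(a => (a.text(), a.attr("abs:href")))

@main
def run(url: String): Unit = {
  val doc = Jsoup.connect(url).get() // performs the HTTP request
  extractLinks(doc).foreach { case (text, href) => println(s"$text -> $href") }
}
```

Saved as, say, `links.sc`, it would be started with `amm links.sc <page-url>`.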

It is a tiny, tiny amount of code. There is some noise from interoperability with Java, which, with more usages, could be wrapped in a helper function.

Running the application yields the titles:

At this point adding more functionality is super simple. Let’s print out part of the title in blue using another library – https://github.com/lihaoyi/fansi

Again the full code, since it is so short. The only difference is the additional dependency and a function coloring part of the link text.

I am not going further with the post. I found Ammonite very useful for quickly testing short scripts or writing tiny tools. Some of them I turn into full projects based on SBT.
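The coloring part could be sketched as follows. This assumes the fansi version shown, and the `highlight` helper and its parameters are hypothetical.

```scala
import $ivy.`com.lihaoyi::fansi:0.4.0`

// Color the first `n` characters of a title blue and leave the rest untouched.
def highlight(title: String, n: Int): fansi.Str = {
  val (head, tail) = title.splitAt(n)
  fansi.Color.Blue(head) ++ fansi.Str(tail)
}

// Printing a fansi.Str renders the ANSI escape codes for the terminal.
println(highlight("OpenShift Morsels", 9))
```

Applied to the link text in the scraper, this prints each title with its leading part in blue.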

Documentation

You have an opportunity to support Haoyi

Ammonite, described here together with other libraries, is not the only work Haoyi is doing for the Scala community. His open source work comes with excellent documentation and lots of examples; just check the links above.

He creates many useful libraries and demonstrates different approaches to solving problems, and his blog posts explain the reasoning behind them.

I truly admire his work in this area. Additionally he speaks at conferences and is an active community member.

A couple of months ago he gave us, the community, an opportunity to support his open source efforts. This is really great news for everyone. I believe that through this support he can focus more effort on creating more interesting tools and learning materials.

You can visit his Patreon page at https://www.patreon.com/lihaoyi/overview and become a patron of his work. I already did that and I would be glad if you did too!

Support me

If you would like to support me, please consider joining the newsletter. It makes me very happy to see a growing list of readers.
