d3b-center/d3b-scala

Name: d3b-scala

Owner: Center for Data Driven Discovery in Biomedicine

Created: 2017-04-04 13:48:57.0

Updated: 2017-05-30 03:30:33.0

Pushed: 2017-05-27 16:59:10.0

Size: 14

Language: Scala

README

A collection of usage examples. It is not meant to be comprehensive, but to grow organically.

I typically refer to my usage of Scala as “Scalite”, because I only focus on the features and libraries that have proven to really pay off over time. Using other features is strongly discouraged, although there have been a few instances where I “promoted” an initially discouraged feature after finding an interesting use case for it (for instance the <:< class, under very specific circumstances - TODO: show example).
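
As a possible illustration of <:<, here is a minimal sketch. It is not the specific use case alluded to above; the names ConformsSketch, Box and unwrap are hypothetical. The implicit evidence makes a method available only when the type parameter is provably a subtype of some other type, which is the same technique the standard library uses for Option.flatten.

```scala
object ConformsSketch {
  // Hypothetical wrapper type, for illustration only.
  final case class Box[A](value: A) {
    // Only compiles when the compiler can prove that A is a subtype of Option[B];
    // the evidence `ev` doubles as the conversion function A => Option[B].
    def unwrap[B](implicit ev: A <:< Option[B]): Option[B] = ev(value)
  }

  val ok: Option[Int] = Box(Option(1)).unwrap

  // Box(1).unwrap // would not compile: Int is not a subtype of Option[B]
}
```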

Getting scala and sbt

See this example for more specific instructions on how to get set up on a specific project.

SBT is the go-to Scala build tool.

Getting ScalaIDE and SBT eclipse plug-in

ScalaIDE is an Eclipse spin-off maintained by Lightbend itself. The SBT Eclipse plug-in generates all the Eclipse-related files needed for an easy “Import existing project” in Eclipse.

IntelliJ

If Eclipse is not to your liking, IntelliJ is another widely used IDE with a Scala extension. Here is a helpful quickstart.

entry point (main)

There are two ways to define the entry point in scala:

main method
object MyDriverObject {
  def main(args: Array[String]): Unit = {
    ...
  }
}


or with less boilerplate:

object Nebulis {
  def main(args: Array[String]) { // procedure syntax (no ": Unit ="), deprecated since Scala 2.13

  }
}

App trait
object Nebulis extends App {
  ...
}

scalite

In scalite we favor the main approach for three reasons:

With regard to the third point, consider the following example

object Nebulis extends App {

  ... // main expressions go here directly (not ideal)

  val SomeVal = ...

  def someMethod = {
    ...
  }

}
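
For comparison, here is a minimal sketch of the same layout with an explicit main method. The names NebulisMain, someVal and someMethod, and their bodies, are placeholders. The point is only that definitions stay at the object level while everything executed at startup is confined to main.

```scala
object NebulisMain {

  // definitions live at the object level...
  val someVal: Int = 42

  def someMethod(x: Int): Int = x + 1

  // ...while the statements executed at startup are confined to main
  def main(args: Array[String]): Unit = {
    println(someMethod(someVal))
  }
}
```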



I/O

I/O support in Scala is rather bare and mostly falls back on Java's (which has gotten much better recently). There is, however, talk of including the better-files Scala library in the standard library.

I typically include the snippets below in a utils object for convenience (more on that in the implicit class section). Note that utility libraries like apache-commons and guava are also available if need be.

read file

if small enough:

val content: String = {
    val src = io.Source.fromFile("/path/to/file")
    val text = src.mkString // reads the whole file into memory
    src.close()

    text
}

if potentially big, then can use lazy evaluation:

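val src = io.Source.fromFile("/path/to/file")
val lines: Iterator[String] = src.getLines() // lazy: lines are read on demand

// ... consume lines here, while the source is still open ...

src.close() // close only once the lines have been consumed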
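
One way to keep the lazy read and the close() call together is a small loan-pattern helper. This is a sketch only; LazyRead and withLines are hypothetical names, not part of any library.

```scala
import scala.io.Source

object LazyRead {
  // Loan pattern: opens the file, lends the lazy line iterator to `f`,
  // and closes the source once `f` has returned (or thrown).
  def withLines[A](path: String)(f: Iterator[String] => A): A = {
    val src = Source.fromFile(path)
    try {
      f(src.getLines())
    } finally {
      src.close()
    }
  }
}
```

For instance, LazyRead.withLines("/path/to/file")(_.count(_.nonEmpty)) counts the non-empty lines without loading the whole file. The function passed in must finish consuming the iterator before returning, since the source is closed right after.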

write file

if small enough:

val content: String = "my content"
val fw = new java.io.FileWriter("/path/to/file")
fw.write(content)
fw.close()

if potentially big:

val lines: Iterable[String] = Iterable("my line 1", "my line 2", "my line 3")
val fw = new java.io.FileWriter("/path/to/file")
lines
    .map(line => s"${line}\n") // because there is unfortunately no fw.writeLn
    .foreach(fw.write)
fw.close()
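
As a possible variant, wrapping the FileWriter in a java.io.BufferedWriter gives access to newLine(), and a try/finally ensures the file is closed even if a write fails. The object and method names (LineWriter, writeLines) are hypothetical.

```scala
import java.io.{BufferedWriter, FileWriter}

object LineWriter {
  // Writes each line followed by the platform line separator,
  // closing the writer even if a write fails.
  def writeLines(path: String, lines: Iterable[String]): Unit = {
    val bw = new BufferedWriter(new FileWriter(path))
    try {
      lines.foreach { line =>
        bw.write(line)
        bw.newLine() // BufferedWriter's stand-in for the missing writeLn
      }
    } finally {
      bw.close()
    }
  }
}
```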

Pattern matching

Example 1

How to convert "0"/"1" into a Boolean and make sure it errors out on any other value:

ment(key) match {
            case "0" => false
            case "1" => true
            // this will throw a MatchError otherwise
  }

In Java you'd have to explicitly throw some kind of exception or assert(false) to get the same effect.
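
A self-contained sketch of the same idea; the object and method names (ZeroOne, toBoolean) are hypothetical and not taken from the original code:

```scala
object ZeroOne {
  // Accepts only "0" or "1"; any other input raises a scala.MatchError
  // because the match has no fallback case.
  def toBoolean(raw: String): Boolean = raw match {
    case "0" => false
    case "1" => true
  }
}
```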

Enumerations (enum)

“enum”s are a very Java-esque feature, only really needed because the language lacks a construct for true singletons (see the “Singleton” GoF pattern).

There are two ways to reproduce enums in Scala: the standard library's Enumeration class, or a sealed trait with case objects. The latter looks something like this:

sealed trait MyEnum

case object MyEnum1 extends MyEnum
case object MyEnum2 extends MyEnum

The nice thing about this is that, unlike with Java enums (or the Enumeration approach for that matter), if at some point one of your “instances” does require parameters (and is therefore no longer really an enum instance), you haven't painted yourself into a corner:

sealed trait MyEnum

case object MyEnum1 extends MyEnum
case object MyEnum2 extends MyEnum
case class MyEnum3(someParam: String) extends MyEnum

Because Scala stands for “Scalable Language”, in the sense that libraries can be built to make it “scale” to your needs (as opposed to providing language-level constructs), there are libraries that can help. My favorite is “Enumeratum”, thanks to which you also get all the goodies like .valueOf and .values (and in fact more goodies than the Java enum keyword offers). Basically, a pure object-oriented language that supports case analysis (“pattern matching” in Scala) does not need an enum construct per se; it can just offer it as a library.

Lastly, and maybe most importantly, you can then process the enum values like so:

myEnum match {
  case MyEnum1 => ...
  case MyEnum2 => ...
  case MyEnum3(s) => ... // s has been bound to the value already
}

And on top of that, it's semi-typesafe! If you forget a case, you'll get a compiler warning (hence the “semi”).
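
Putting the pieces together, a minimal runnable sketch. The wrapper object EnumSketch, the describe method and the returned strings are hypothetical additions for illustration.

```scala
object EnumSketch {
  sealed trait MyEnum
  case object MyEnum1 extends MyEnum
  case object MyEnum2 extends MyEnum
  case class MyEnum3(someParam: String) extends MyEnum

  // Because MyEnum is sealed, the compiler can warn when a case is missing here.
  def describe(e: MyEnum): String = e match {
    case MyEnum1    => "first"
    case MyEnum2    => "second"
    case MyEnum3(s) => s"third with $s" // s is bound to someParam
  }
}
```

Removing one of the three cases from describe should still compile, but with a "match may not be exhaustive" warning.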

Json

There are multiple libraries available in Scala and Java to process JSON. This section covers play-json.

play-json

Like most JSON libraries on the JVM, there are three ways to use it:

  1. “tree” model: using the JsValue hierarchy
  2. simple data binding: the naive way, as nested Maps and Seqs (not recommended)
  3. full data binding: converting back and forth with case classes

I tend to use (1) for easy cases and (3) for more complex ones, although (3) gets harder to use the looser the underlying schema is.

setup

add this to your build.sbt file:

libraryDependencies += "com.typesafe.play" %% "play-json" % "2.5.10" withSources() withJavadoc()

then in your source files

import play.api.libs.json._

or better yet:

import play.api.libs.json.{Json, JsObject, JsString, <... as needed>}

caution: watch out for the play.libs.Json object, an unfortunate choice of the library designers to keep

tree model
myJsValue match { // JsValue is sealed, so this is typesafe (warning only though)
  case    JsNull    => println(null)
  case b: JsBoolean => println(b)
  case n: JsNumber  => println(n)
  case s: JsString  => println(s)
  case o: JsObject  => println(o)
  case a: JsArray   => println(a)
}

to build values manually:

// object
val myJsObject: JsObject =
  Json.obj(
    "k1" -> v1,
    "k2" -> Seq("a", "b", "c")
    ...)

// array
val myJsArray: JsArray =
  Json.arr("v1", "v2", ...)

more typically (i.e. when transforming data):

// object
val myPairs: Seq[(String, JsValue)] = ...
val myJsObject: JsObject = JsObject(myPairs)

// array
val myJsValues: Seq[JsValue] = ...
val myJsArray: JsArray = JsArray(myJsValues)

to traverse existing values:

// object
myJsObject
    .fields
    .map { case (keyString, jsValue) =>
        ...
    }

// array
myJsArray
    .value
    .map { jsValue => // typically proceed with pattern matching on jsValue
        ...
    }

full data binding

WIP
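
A minimal sketch of what full data binding can look like, assuming the play-json 2.5.x dependency from the setup section. The Sample case class and the raw payload are hypothetical, made up for illustration. Json.format derives the converters from the case class at compile time.

```scala
import play.api.libs.json._

object JsonSketch {
  // Hypothetical case class and payload, for illustration only.
  final case class Sample(id: String, reads: Int)

  // Derives both the reader and the writer for Sample.
  implicit val sampleFormat: OFormat[Sample] = Json.format[Sample]

  val raw: String = """{"id": "s1", "reads": 42}"""
  val js: JsValue = Json.parse(raw)

  // tree model, for comparison: navigate the JsValue hierarchy directly
  val id: String = (js \ "id").as[String]

  // full data binding: convert to and from the case class
  val sample: Sample = js.as[Sample]
  val back: JsValue = Json.toJson(sample)
}
```

Note that as[...] throws if the JSON does not match the expected shape; validate[...] returns a JsResult instead, which is usually the safer choice when the schema is loose.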


This work is supported by the National Institutes of Health's National Center for Advancing Translational Sciences, Grant Number U24TR002306. This work is solely the responsibility of the creators and does not necessarily represent the official views of the National Institutes of Health.