Scala Beginners: counting words and lines
A simple command-line tool
At work, I’m teaching one intern and one colleague how to program.
I gave them the following exercise:
Write a utility that takes three command-line parameters: P1, P2 and P3. P1 is always a file path/name. P3 is optional (see below). P2 can take the values:
- “lines”
- “words”
- “find”
P3 is only relevant (and required) if P2 is “find”; otherwise it is not needed.
So, the utility does the following:
- If P2 is “lines”, it prints how many lines the file has
- If P2 is “words”, it prints how many words the whole file has
- If P2 is “find” it prints out the lines where P3 is present
Examples:
Given a file myfile.txt with this content (without the line-numbers):
01: Scala is a statically typed
02: language released in 2004.
03:
04: It incorporates language
05: features found in Ruby,
06: Haskell and Java, among
07: others.
Then, if the utility is called “mytool”:
mytool myfile.txt lines
7
mytool myfile.txt words
21
mytool myfile.txt find language
language released in 2004.
It incorporates language
I asked them to solve it in Ruby, since I think it is an easier language for them to start with.
How would you solve this in Scala?
If you’re learning Scala, I encourage you to try it yourself first instead of reading my solution. It’s simply the best way to learn.
Feel free to ask questions in the comments section.
If you already tried, are stuck or are just outright impatient, read on…
Solution in Scala
Here are some highlights of my solution:
Matching the command-line arguments
The naive approach:
if (args.size == 2) {
  if (args(1) == "lines") {
    countLines(args(0))
  } else if (args(1) == "words") {
    countWords(args(0))
  } else {
    printUsage()
  }
} else if (args.size == 3 && args(1) == "find") {
  // Checking the size first avoids an out-of-bounds error on too few arguments
  findWord(args(0), args(2))
} else {
  printUsage()
}
The better approach:
args match {
  case Array(file, "words") => countWords(file)
  case Array(file, "lines") => countLines(file)
  case Array(file, "find", wordToFind) => findWord(file, wordToFind)
  case _ => printUsage()
}
Things to notice:
- How much simpler and more elegant it is
- How we match and bind variables
- How we match against string literals
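To experiment with this kind of match without touching any files, here is a minimal sketch that turns the arguments into a small command type. The names ArgsDemo, Command, CountLines, CountWords, Find, Usage and parse are hypothetical and only serve the illustration; they are not part of my solution.

```scala
object ArgsDemo {
  // Hypothetical command type, introduced only for this example
  sealed trait Command
  case class CountLines(file: String) extends Command
  case class CountWords(file: String) extends Command
  case class Find(file: String, word: String) extends Command
  case object Usage extends Command

  // Array patterns match on length and content at the same time:
  // lowercase names bind a value, string literals must be equal.
  def parse(args: Array[String]): Command = args match {
    case Array(file, "lines")      => CountLines(file)
    case Array(file, "words")      => CountWords(file)
    case Array(file, "find", word) => Find(file, word)
    case _                         => Usage // wrong arity or unknown mode
  }
}
```

Calling ArgsDemo.parse(Array("myfile.txt", "find")) falls through to Usage, because no pattern with exactly two elements has “find” in second position.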
Counting lines
import scala.io.Source

def countLines(file: String): Unit = {
  val src = Source.fromFile(file)
  try println(src.getLines().size)
  finally src.close() // release the file handle
}
Things to notice:
- The use of Source and getLines to read the lines of a file
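A Source keeps the file open until it is closed. If you are on Scala 2.13 or later (an assumption; older versions don't have it), scala.util.Using can take care of closing. A minimal sketch, where CountLinesDemo is a hypothetical name and the function returns the count instead of printing it so that it is easier to test:

```scala
import scala.io.Source
import scala.util.Using

object CountLinesDemo {
  // Using.resource closes the Source when the block finishes,
  // even if reading the file throws an exception.
  def countLines(file: String): Int =
    Using.resource(Source.fromFile(file)) { src =>
      src.getLines().size
    }
}
```

Note that the count has to be computed inside the block: getLines returns a lazy iterator, which can no longer be read once the Source is closed.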
Counting words
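def countWords(file: String): Unit = {
  val src = Source.fromFile(file)
  try {
    val count =
      (for {
        line <- src.getLines()
      } yield {
        // Split on runs of whitespace and skip the empty strings
        // that blank or indented lines produce
        line.split("\\s+").count(_.nonEmpty)
      }).sum
    println(count)
  } finally src.close() // release the file handle
}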
Things to notice:
- How we yield the number of words per line
- How we sum the per-line counts with sum
- How we split words by one or more whitespace characters (space, tab, etc.)
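One edge case of split is worth knowing about: on the JVM, splitting an empty string returns a one-element array containing the empty string, and a line that starts with whitespace yields a leading empty string. Unless those are filtered out, a blank line counts as one word. A minimal sketch (WordCountDemo and wordsIn are hypothetical names):

```scala
object WordCountDemo {
  // Count the words in a single line, ignoring the empty strings
  // that split can produce for blank or indented lines.
  def wordsIn(line: String): Int =
    line.split("\\s+").count(_.nonEmpty)

  def main(args: Array[String]): Unit = {
    println("".split("\\s+").length)   // 1: the result is Array("")
    println(wordsIn(""))               // 0
    println(wordsIn("  Scala is fun")) // 3
  }
}
```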
Finding words
def findWord(file: String, wordToFind: String): Unit = {
  val src = Source.fromFile(file)
  try {
    for {
      (line, idx) <- src.getLines().zipWithIndex // idx is zero-based
      if line.contains(wordToFind)
    } {
      println(f"$idx%02d: " + line)
    }
  } finally src.close() // release the file handle
}
Things to notice:
- The use of zipWithIndex to have the index of the current element in a loop
- The guard/condition in the for loop, which filters the lines we want
- The use of the f string interpolator to format the index with two digits
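The same three ingredients also work on an in-memory collection, which makes them easy to try in the REPL. Keep in mind that zipWithIndex starts counting at 0, so you need to add 1 if you want line numbers like the ones in the example file. A minimal sketch, where FindDemo and find are hypothetical names and the 1-based numbering is my choice for this example:

```scala
object FindDemo {
  // Returns the matching lines, each prefixed with a 1-based,
  // zero-padded line number.
  def find(lines: Seq[String], word: String): Seq[String] =
    for {
      (line, idx) <- lines.zipWithIndex // idx starts at 0
      if line.contains(word)            // guard: keep matching lines only
    } yield f"${idx + 1}%02d: $line"    // %02d pads the number to two digits
}
```

With the first four lines of the example file, FindDemo.find(lines, "language") gives "02: language released in 2004." and "04: It incorporates language".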
Drop me a line and tell me what you think! :-)