Speller Minimum Implementation

Go has built in support for writing a http based service. This makes it incredibly easy to start up a simple REST service such as the one we’re writing.

We have two basic parts for this simple version of the project.

  • Read the file into some sort of data structure
  • Provide the REST api server

Read the file

We want to read the file into a datastructure that provides quick lookup for existence. This pretty much begs for a map but let’s discuss possibilities.

  • Just search the file each access
    • This will create an I/O bottleneck for ever access. Doable, but too enificient.
  • Store it in a sorted list
    • Using a binary search, we can get O(log n) searches. That’s not too bad and we likely save memory this way (exercise for a future blog post, prehaps.) In practice, the memory savings probably isn’t that great.
  • Store it in a tree
    • We don’t have to sort it, as long as the data is not presorted. We get good search times, but memory probably isn’t much better than just using a map.
  • Use a map of string to bool
    • Quick look ups O(1) “ish.”
      • Remember, most maps are buckets of linked lists so it’s not true O(1) but it’s close enough. It’s still good to remember that fact because it may burn you at some point in your career.
    • Some memory inefieciencies since every entry is a (string,bool) pair. However, the string portion is the bulk of the memory used.

The code to read the file

14
15
16
17
18
19
20
21
22
23
24
25
26
wordMap := map[string]bool{}
file, err := os.Open("words.txt")
if err != nil {
    log.Fatal(err)
}
defer func() {
    file.Close()
}()
scanner := bufio.NewScanner(file)
scanner.Split(bufio.ScanLines)
for scanner.Scan() {
    wordMap[strings.ToLower(scanner.Text())] = true
}

Pretty simple code. We open the file to read, fail on any error because we can’t do anything if don't have words to parse. The code could be a little more succent by closing the file explicitly after we read it but the defer is a good paradigm to get used to using.

Serve the endpoint

The net/http and encoding/json packages provide everything we need to serve JSON endpoints. We need to provide a handler that will listen to requests of a certain type and then do some processing to return the results. For our simple example, we will just listen for all requests using the pattern of “/” for all requests. Anything is the request URI will be treated as the word to lookup.

Handler code

28
29
30
31
32
33
34
35
36
37
38
http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
    word := strings.ToLower(r.RequestURI)[1:] // strip the leading /
    if len(word) == 0 {
        w.WriteHeader(http.StatusBadRequest)
        return
    }
    found := wordMap[word]
    w.Header().Add("Content-Type", "application/json")
    w.WriteHeader(http.StatusOK)
    json.NewEncoder(w).Encode(found)
})

A couple of things of note. We use a slice operation to strip the leading slash on line 29. In other languages (such asJava) it isn’t quite as concise. Another thing is that we need to handle the error case for when a request comes in without a word. You might be tempted to use an if-else for the empty word vs actual lookup but idiomatic go prefers the quick return.

Server Code

The actual code to listen to an endpoint is trivial (for this simple implementation.)

39
40
41
if err := http.ListenAndServe(":8080", nil); err != nil {
    log.Fatalf("Error starting: %v", err)
}

This spins up the http server to listen on port 8080 on all local addresses. Which is fine for simple projects but we may want to listen on a different port or only on a limitted set of local IP addresses if we are in a multihomed environment. Also note that we do not serve a secure endpoint (https) here. One would be tempted to say that for this project, we don't need a secure endpoint. Which is likely true, however, one should always think about security and get into the habit of writing secure endpoints more as the rule than the exception.

Final Thoughts

So we spun up a simple rest endpoint and we can call it good, right? Where’s the fun in that? There are more steps to complete before we can call this “production ready.” We need to add a few things:

  • Single binary
    • For ease of deployment. Right now there are two files needed, the executable and the word file.
  • Logging
    • Need to know what is happening
  • Configurability
    • Should be able to configure the port and listening address
  • Automated Build
    • For ease of use in a CI/CD environment
  • Scaling
    • How would this scale easily?
    • We’re going to use kubernetes.
  • Maybe other languages?

Code for this post can be found at: https://github.com/ericfialkowski/speller/tree/Step1