Code Challenge: Using ASCII & Unicode with Swift / by Wayne Bishop

Most technical interviews have at least one question related to String manipulation. As we know, working with Strings and textual data is a cornerstone of practically every app. In this code challenge, our goal is to write a simple function to determine if an input String contains all unique characters:

 
//challenge: write function to determine if an input String (e.g element) contains all unique characters).  


func isStringUnique(element: String) -> Bool {
    //code goes here..
}

To start, we've received a preliminary function signature. The idea is that isStringUnique() will accept a single String and will return a Boolean value. As an iOS developer, note that our function doesn't return an Optional. As a result, we'll need to account for each permutation where our algorithm could return a true or false value.

 

THE CONVENTIONAL APPROACH

In a previous essay, I note there are often many ways to approach a problem. For our challenge, one method is to apply a brute force technique. The idea? Sequentially check each String character against every other character to determine if there's a match. In Swift, this can be accomplished through a series of nested loop statements. Consider the following:

func isStringUnique(element: String) -> Bool {
    
    let alist = Array(element)
    let blist = alist
    
    //compare each character with every other character
    for a in 0..<alist.count {
        for b in 0..<blist.count {
            if (b != a) && (String(blist[b]) == String(alist[a])) {
               return false
            }
        }
    }
    return true
}

Even though our solution applies a brute force method, it does provide some advantages. Primarily, it does answer the question  and produces a consistent result. A non-obvious benefit is also its syntax. Most developers could glance at the code and have a general idea as to its logic. As such, the syntax is not overly complicated, making it relatively straightforward to recall in an actual interview situation. 

 

PERFORMANCE ANALYSIS

Testing isStringUnique() for algorithmic performance, however, reveals various issues. What's apparent is the number of times each character is checked before a true or false conclusion is reached. Evaluating the function with an exciting sampling of words reveals the following:

 
brute-force-algorithm.png
 

 

As shown, the algorithm must perform a significant number of checks to conclude an outcome. Since words like Rhinoceros don't contain 49 characters, it's clear that multiple letter combinations get rechecked. However, the most significant concern is Uncopyrightable, running 225 iterations before returning a result. Beyond its apparent performance, the function also lacks support for trivial base cases. In other words, do necessary conditions exist that would help us quickly conclude a true or false result? Regardless, it would be ideal to optimize the solution to check each character once - thus improving its average running time to O(n) - linear time. 

 

THE UNICODE APPROACH

As the essay title implies, let's rethink our solution to make use of Unicode. Beyond iOS development, folks that have coded HTML pages will no doubt be familiar with writing escape characters (e.g. &#65;) to get things to work. Overall, it is surprisingly useful to have an alternate way to interpret letters and special characters with integer values. As such, modern programming languages (e.g., Swift) have not only adopted ASCII, but its Unicode successor to support Emoji, Chinese/Japenese characters and beyond. Let's use these capabilities to refactor our solution. Consider the following: 

func isStringUnique(element: String) -> Bool {
    
    //evaluate trivial case
    guard element.characters.count < 128 else {
        return false
    }
    
    //match unicode representation - O(n)
    var list = Array<Bool?>(repeatElement(nil, count: 128))
    
    for scalar in element.unicodeScalars {
        let unicode = Int(scalar.value)

        if list[unicode] != nil {
            return false
        }
        list[unicode] = true
    }    
    
    return true
}

To understand our new function, let's start with the initial guard statement. As discussed, we now have a simple base case to account for edge cases. That is, any word having more than 128 characters must include at least one value that repeats (since the basic ASCII table supports 128 unique values).

 

Next, we optimize an Array to track the occurrence of each Unicode character. To streamline the process, we use each character's Unicode value as an index to store a Boolean. For example, the Unicode value for "R" is represented with "82". Since Array indices are unique, each character is checked once, thus achieving our average running goal of O(n) - linear time. A performance comparison of the two algorithms reveals the following: 

 
unicode-algorithm.png
 

 

Working with Unicode makes it straightforward to represent textual data with numbers. As a result, this avenue of development opens the way for learning encryption/hashing techniques as well as popular data structures such as Hash Tables and Blockchains


Learning iOS for your next project or technical interview? Most successful apps combine language features, tools and various techniques to solve a specific problem. Based on the implementation of my own projects, use this free worksheet to plan your learning path.