Hybrid iOS app’s

Native iOS app’s are built with the platform SDK provided by Apple (Xcode/Objective-C / Swift). The main advantage of native applications is their performance because they are compiled into machine code.

Mobile Web iOS app’s are built with server-side technologies (PHP, Node.js, ASP.NET) that are mobile-optimized to render HTML well on iPhones and iPads. Mobile app’s load within the mobile browser Safari like every other website. They generally require an Internet connection to work, but it’s also possible to use cache technologies to use them off-line and you can install them with an icon on the iOS home screen. Mobile Web app’s for iOS are faster and easier to develop and to maintain than native iOS app’s.

Hybrid iOS app’s

Hybrid iOS app’s are somewhere between native and web app’s. The bulk of hybrid iOS app’s is written with cross-compatible web technologies (HTML5, CSS and JavaScript), the same languages used to write web app’s. Hybrid app’s run inside a native container on the device using the UIWebView or WKWebView classes.  Some native code is used however to allow the app to access the wider functionality of the device and to produce a more refined user experience. The advantage of this approach is obvious: only a portion of native code has to be re-written to make the app work on another type of device, for instance on Android, Blackberry or Windows Mobile.

Some developer consider hybrid iOS app’s as beeing the best of both worlds. Great examples of hybrid iOS app’s are Facebook and LinkedIn.

Hybrid iOS app tools

There are numerous tools available to create hybrid app’s, not only for iOS, but for all mobile platforms. The most important tools to embed HTML5, CSS and Javascript in mobile app’s are :

  • PhoneGap : a mobile development framework created by Nitobi, which was purchased by Adobe in 2011. The latest version is 4.0.0.
  • Apache Cordova : a free and open-source platform for building native mobile applications using HTML, CSS and JavaScript, underlying PhoneGap. Apache Cordova graduated in October 2012 as a top level project within the Apache Software Foundation (ASF). The latest version is 5.0.0.
  • BridgeIt : an open source mobile technology sponsored by ICEsoft Technologies Inc.
  • MoSync : free, easy-to-use, open-source tools for building cross-platform mobile apps (MoSync Reload 1.1 and MoSync SDK 3.3.1).
  • Ionic : the final release of Ionic 1.0.0 (uranium-unicorn) was released on May 12, 2015.
  • CocoonJS : a platform to test, accelerate, deploy and monetize your HTML5 apps and games on all mobile devices with many interesting features to help you deliver great web products faster.
  • ChocolateChip-UI : the tool uses standard Web technologies and is built on top of the popular & familiar jQuery library.
  • Sencha-Touch : the leading (very expensive) cross-platform mobile web application framework based on HTML5 and JavaScript for creating universal mobile apps.
  • Kendo UI : everything for building web and mobile apps with HTML5 and JavaScript.
  • AppGyver (Steroids, Supersonic, Composer) : a combination of great CSS defaults (from the Ionic Framework) with the best native APIs (from PhoneGap), adding a boatload of extra features on top.

There are also some tools and platforms available which use other technologies than HTML5 +CSS + Javascript :

  • Titanium : write in JavaScript, run native everywhere at the speed of mobile, with no hybrid compromises.
  • Xamarin : share the same C# code on iOS, Android, Windows, Mac and more.
  • RhoMobile Suite : gain the ease of consumer smartphone platforms, without sacrificing the critical data functionality of enterprise solutions.
  • Corona : designed to enable super-fast development. You develop in Lua, a fast and easy-to-learn language with elegant API’s and powerful workflows.
  • Meteor : discover the easiest way to build amazing web and mobile apps in JavaScript.

WebViews

The tools to create hybrid app’s give you access to the underlying operating system calls or to underlying hardware functionality. Howver you do not really need these tools if you want only deploy a web app as a native app. All you need is the bare minimal code to launch a Web View and load your Web App as local resources.

UIWebView

To embed web content in an iOS application you can use the UIWebView class. To do so, you simply create a UIWebView object, attach it to a window, and send it a request to load web content. You can also use this class to move back and forward in the history of webpages, and you can even set some web content properties programmatically.

The steps to create a minimal demo to load this web content from an external webserver are the following :

  1. Create an Xcode Swift project with a single view application.
  2. Open the Main.storyboard file and drag the WebView UI element from the object library to the Interface Builder view.
  3. Open the assistant editor to display the ViewController.swift file, ctrl-click-drag the WebView behind the class ViewController text line to create an IBOutlet named mywebView :
     @IBOutlet weak var mywebView: UIWebView!
  4. Define the url of the remote website :
    let url = "http://www.canchy.eu"
  5. Add the following code to the viewDidLoad function :
    let requestURL = NSURL(string:url)
    let request = NSURLRequest(URL: requestURL!)
    mywebView.loadRequest(request)
  6. Run the project :
Remote

Remote website displayed in iOS Simulator

The website loads, but it looks awful. The reason is that the UIWebView doesn’t handle things that Safari does by default. Telling the program to scale pages is easy. We need to add the following line of code :

mywebView.scalesPageToFit = true

The whole script of the ViewController.swift file looks now as follows :

View

ViewController.swift script for RemoteWeb

The layout is still not perfect, but you can now pinch the view to zoom in and out. Holding down the alt key and click-dragging the view allows you to perform this action in the simulator. We will enhance the scaling in a further chapter.

Note that for a very simple app like the present one it’s not necessary to import the webkit framework. UIWebView has about a dozen methods you can use, but to get about 160 methods you need the webkit classes.

Project

Project window with local web content

To include the web content into the iOS app, we add the folder canchy with html files, css files, javascript files, images and photos into the project window.

The three first steps to create a minimal demo to load local web content are the same as those for the remote content.

The next steps are the following :

4. Define the path for the local content with the NSBundle.mainBundle() method :

  • pathForResource
  • ofType
  • inDirectory

5. Create an URL object with this path with the NSURL.fileURLWithPath() method

6. Create a Request object with this URL with the NSURLRequest() method.

7. Scale the page with the scalesPageToFit method.

8. Load the Request object with the loadRequest() method.

9. Run the project.

The result is the same as with the remote content. The layout is not adapted to the screen. We will fix that in the last chapter.

The complete script of the ViewController.swift file is shown below :

 

local

ViewController.swift script for local web content

WKWebView

With iOS 8 and Mac OSX Yosemite, Apple released the new WKWebView class, which is faster and more performant than the UIWebView class. WKWebView uses the same Nitro Javascript engine as the Safari browser.

To create an hybrid iOS app with the WKWebView class, the procedure to access content on an external web-server is the same as for the UIWebView class. Only the code in the ViewController.swift file is slightly different.

WK

ViewController.swift script for remote web content with WKWebView

We need to import the WebKit framework, because the WKWebView class is now part of the WebKit itself.

Remote

Remote website displayed in iOS Simulator

By default the website display in WKWebView is better adapted to the screen size,  compared to the UIWebView.

The worst issue of the WKWebView class is that it doesn’t allow loading local files from the file:// protocol like the UIWebView class. There are some workarounds available, most of them use an embedded local HTTP server to serve the local files. The most renowned solutions are the following :

XWebView uses a very tiny embedded http server, whereas the Cordova WKWebView Engine uses the more versatile GCDWebServer.

GCDWebServer

GCDWebServer is a modern and lightweight GCD (Grand Central Dispatch) based HTTP 1.1 server designed to be embedded in OS X & iOS app’s. It was written from scratch by Pierre-Olivier Latour (swisspol) with the following features :

  • four core classes: server, connection, request and response
  • no dependencies on third-party source code
  • full support for both IPv4 and IPv6
  • HTTP compression with gzip
  • chunked transfer encoding
  • JSON parsing and serialization
  • fully asynchronous handling of incoming HTTP requests

WKWebView versus UIWebView

Both WKWebView and UIWebView in iOS8 support WebGL which allows rendering 3D and 2D graphics on WebView. The performance gains are significant (greater than 2x) in WebGL applications when using WKWebView, compared to UIWebView.

You can use the WebView Rendering App available in Apple AppStore to test any website performance between UIWebView and WkWebView. An In-App debug console for your UIWebView & WKWebView classes is available at Github.

CORS

If a javascript on a webpage displayed in UIWebView or WKWebView is running from domain abc.com and would like to request a resource via AJAX (XmlHttpRequest or XDomainRequest) from domain efg.com, this is a cross-origin request. Historically, for security reasons, these types of requests have been prohibited by browsers.

The Cross Origin Resource Sharing (CORS) specification, now a W3C recommendation, provides a browser-supported mechanism to make web requests to another domain in a secure manner. To allow AJAX requests across domain boundaries, you need to add the following headers to your server :

Access-Control-Allow-Origin.*
Access-Control-Allow-Headers : Accept, Origin, Content-Type
Access-Control-Allow-Methods : GET, POST, PUT, DELETE, OPTION

Size the web views

Xcode provides an auto layout function. By clicking the second icon (pin) from the left in the storyboard, a popup window opens to set the spacing contraints to the nearest neighbours. By clicking the dotted red lines leading from the small square at the top to four textboxes, you can define these constraints. I set the top border to 20 points and the right and left border to 16 points.

Constraints

Adding constraints to execute the auto layout function in the storyboard

By clicking the button “Add 3 Constraints” the values are entered and displayed with the selected values in the View Controller Scene.

Constraints

Viewing the layout constraints in the View Controller Scene

The result is a correct display of the webpage in the iOS Simulator with a free toolbar of 20 points at the top of the web view.

Display

Webpage view with auto layout

Links

iOS Xcode Project

Last update : September 18, 2015

Xcode Project

This topic refers to my recent post about iOS Development and provides more detailed informations about iOS Xcode 6.3 projects. An Xcode project is the source for an iOS app; it’s the entire collecton of files and settings needed to construct the app. An iOS app is based on event-driven programming.

To create a new project in Xcode, the menu File -> New -> Project leads to a dialog window to chose a project template.

new

Template selection for a new Xcode project

The Single View Application in the iOS templates is a good starting point to create a simple app, for example the classic HelloWorld demo.

Helloworld

Xcode HelloWorld Demo Project

The name of the app (HelloWorld) is entered as Product Name. Spaces are legal in an app name, we could also name the project Hello World. Other punctuations in the product name can however lead to problems. The product name is used by Xcode to fill in the blank in several places, including some filenames.

My name is entered as Organization Name and my website domain name web3.lu is used as Organization Identifier. The unique bundle Identifier is automatically set to web3.lu.HelloWorld by the system.

The selected language is Objective-C, the other possibility is Swift. This choice is not binding, you can mix Objective-C and Swift code files. The selected device is iPhone, other possibilities are iPad or universal for both iPhone and iPad. The flag “Use Core Data” remains unchecked in a simple project.

Clicking the next button shows a file selection window where you can select or create a location (an Xcode workspace folder) and place the project under version control in a Git repository, which is however design overkill for a small tutorial project.  Clicking the create button generates the skeleton of a simple app, with precompiled files and settings, based on the selected template. If you have not yet an Apple Development Account, a warning message says “no signing identity found”.

Xcode Project Window

The different files and parameters of an Xcode project can be classified as :

  • source files that are to be compiled
  • .storyboard or .xib files, graphically describing interface objects
  • resources such as icons, images, sounds, …
  • settings : instructions to the compiler and linker
  • frameworks with system classes

This is a lot of embodied information that is presented in the Xcode IDE Project Window in graphical form. This window can be configured in the following parts to let you access, edit and navigate your code and to get reports about the progress of building and debugging an app :

  • Menu Bar
  • Tool Bar
  • Left-hand Sidebar : Navigator Pane
  • Right-hand Sidebar : Utilities Pane
  • Middle main area : Editor Pane, or simply “Editor”
  • Middle bottom area : Debugger Pane
Xcode Project Window

Xcode Project Window

A project widow is powerful and elaborate. The different panes and subviews can be shown or hidden inside the View Menu with the corresponding submenus. The width and height of the panes can be changed by dragging the edges. Keyboard shortcuts can be associated to panes to show, hide or switch them easily. Usually all these parts are not displayed at the same time, except very briefly in rather an extreme manner. By control-clicking a area in the Xcode project window, a shortcut menu appears to select help articles.

Menu Bar

The Menu Bar features the following menus :

  • Xcode
  • File
  • Edit
  • View
  • Find
  • Navigate
  • Editor
  • Product
  • Debug
  • Source Control
  • Window
  • Help
Xcode IDE Menubar

Xcode IDE Menubar

Toolbar

The toolbar features the following parts :

  • a build / run and stop button
  • a scheme and destination selector (pop-up menu)
  • an activity viewer (report field)
  • three icons to select the standard, assistant or version editor
  • three icons to show / hide the navigator, utilities or debug areas
Xcode IDE Toolbar

Xcode IDE Toolbar

Navigator Pane

xcode_navigationThe primary mechanism of the navigator pane at left-hand is for controlling what you see in the main area of the project window. You select something in the Navigator pane, and that thing is displayed in the editor of the project window. The Navigator pane itself can display eight different sets of information; thus, there are actually eight navigators represented by the eight icons across its top.

  • Project Navigator : basic navigation through the files of a project
  • Symbol (Hierarchical) Navigator : a symbol is typically the name of a class or method
  • Find (Search) Navigator : to find text globally, even in frameworks
  • Issue Navigator : needed when the code has issues
  • Test Navigator : list test files and run individual test methods
  • Debug Navigator : to track possible misbehavior of the app
  • Breakpoint Navigator : lists all the breakpoints
  • Report (Log) Navigator : lists the recent major actions, such as building or running

At the bottom of the navigator pane is a filter which lets you limit what files are shown. Below the search filed is the current search scope.

 

Utilities Pane

The utilities pane at right-hand contains inspectors in the top half that provide information about the current selection or its settings; in some cases, these inspectors let you change those settings. There are four main cases :

  • a code file is edited : you can toggle between File Inspector and Quick Help.
  • a storyboard file is edited : in addition to the File Inspector and the Quick Help tabs, the following additional inspectors are available : Identity, Attributes, Size and Connections.
  • an asset catalog is edited : in addition to the File Inspector and the Quick Help tabs, the Attribute Inspector lets you define which variants of an image are listed.
  • the view hierarchy is being debugged : in addition to the File Inspector and the Quick Help tabs, the Object Inspector and the Size Inspector are available.
Xcode IDE Inspectors (5 examples)

Xcode IDE Inspectors (5 examples)

In the bottom half the utilities pane contains four libraries that function as a source of objects you may need while editing your project :

  • File Template Library
  • Code Snippet Library
  • Object Library
  • Media Library

You can toggle between these four libraries. The object library is the most important. Press spacebar to see help pop-ups describing a library item.

Xcode IDE libraries (4 examples)

Xcode IDE libraries (4 examples)

Editor

In the middle of the project window is the editor. This is where you get actual work done, reading and writing your code or designing your interface in a .storyboard or .xib file. The editor is the core of the project window. You can hide the navigator pane, the utilities pane, and the debug pane, but there is no such thing as a project window without an editor.

The editor provides its own form of navigation, the jump bar across the top which show you hierarchically what file is currently being edited. The jump bar allows you to switch to a different file and it displays also a pop-up menu, which has also a filter field.

Xcode IDE Editor jumpbar

Xcode IDE Editor jumpbar

To edit more than one file simultaneously, or to obtain multiple views of a single file, you can split the editor window using assistants, tabs and secondary windows. The menus View -> Navigators and File -> Tab or File -> Window allow to activate the splitting. You can determine how assistant panes are to be arranged with the Menu View -> Assistant Editor submenu. The assistant pane bears a special relationship to the primary editor pane (tracking).

Xcode IDE Editor

Xcode IDE Editor

Debugger

The debug navigator displays the call stacks of a paused app. You can debug Swift or C-based code as well as OpenGL frames. The debug navigator opens automatically whenever you pause your application (by choosing menu Debug -> Pause), or if it hits a breakpoint.

Xcode IDE Debugger

Xcode IDE Debugger

The debug navigator shows threads with the following icons :

  • No icon means the thread is running normally
  • A yellow status icon means that the thread is blocked and waiting on a lock or condition
  • A red status icon means that you suspended the thread. A suspended thread does not execute code when you resume your application
  • A gray icon means that thread or block is part of the recorded backtrace and is not currently executing

Anatomy of an Xcode Project

The first item in the Project Navigator represents the project itself. The most important file (archive, folder) in the project folder is HelloWorld.xcodeproj.  All the Xcode knowledge about the project is stored in this file.

Based on the iOS SDK 8.3, a project HelloWorld has two targets :
HelloWorld and HelloWorldTests.
A target is a collection of parts along with rules and settings for how to build a product from them. The test target is a special executable whose purpose is to test the app’s code.

The HelloWorld target includes the following files :

  • AppDelegate class with .h and .m files
  • ViewController class with .h and .m files
  • Main.storyboard file
  • Images.xcassets folder with icons (asset catalog)
  • LaunchScreen.xib file
  • Info.plist file
  • a subfolder Supporting Files with a main.m file and an Info.plist file

The HelloWorldTests target includes the following files :

  • HelloWorldTests.m implementation file
  • a subfolder Supporting Files with an Info.plist file

It’s important to know that groups (like the group Supporting Files) in the project navigator don’t necessarily correspond to folders in the project directory. The Supporting Files group is just a way to clump some items together in the project navigator to locate them easily. To make a new group, use the Menu File -> New -> Group.

The LaunchScreen.xib file is not necessary and can be deleted if a launch screen picture is included in the asset catalog.

The two products which will be created in the project are HelloWorld.app and HelloWorldTests.xctest.

The project and targets are specified by the following parameter sets :

  • General
  • Capabilities
  • Info
  • Build Settings
  • Build Phases
  • Build Rules

You can add more targets to a project, for example a custom framework to factor common code into a single locus or an application extension (examples : content for the notification center, photo editing interfaces, …).

The anatomy of a Xcode project changed with Xcode 4.2.  Prior to this version, an iOS Xcode Project included the following items :

  • a folder xxx.xcodeproj
  • a file main.m
  • a file xxx_Prefix.ch
  • a file xxx-Info.plist
  • a file MainWindow.xib
  • a folder classes with .m and .h files
  • a folder resources with pictures and data

Build a project (Phases, Settings, Configurations)

To build a project is to compile its code and assemble the compiled code, together with various resources, into the app. To run a project is to launch the built app, in the Simulator or on a connected device. Running the app automatically build it first if necessary.

Click at the tab Build Phase at the top of the target editor window to view the stages by which the app is built. The build phases are both a report to the developer and a set of instructions to the Xcode IDE. If you change the build phases, you change the build process.

Xcode IDE Buildphases

Xcode IDE Build Phases

There are four build phases :

  • Target Dependencies
  • Compile Sources
  • Link Binary with Libraries
  • Copy Bundle Resources (copying doesn’t mean making an identical copy, certain types of files are treated in special ways)

You can add build phases with the menu Editor -> Add Build Phase, for example Add Run Script Build Phase to run a custom shell script late in the build process. To check the option “Show environment variables in build log” is then a useful choice.

Another aspect of how a target knows how to build the app is Build Settings. By clciking the tab Build Settings at the top of the target editor window you will find a long list of settings :

  • Architectures
  • Build Locations
  • Build Options
  • Code Signing
  • Deployment
  • Kernel Module
  • Linking
  • Packaging
  • Search Paths
  • Testing
  • Versioning
  • Apple LLVM 6.1 (Code generation, Custom Compiler Flags, Language, Language – C++, Language – Modules, Language – Objective C, Preprocessing, Warning Policies, Warnings – All languages, Warnings – C++, Warnings – Objective C, Warnings – Objective C and ARC)
  • Asset Catalog Compiler – Options
  • Interface Builder Storyboard Compiler – Options
  • Interface Builder XIB Compiler – Options
  • OSACompile – Build Options
  • Static Analyzer (Analysis Policy, Generic Issues, Issues – Objective C, Issues – Security)
  • User Defined
xcode_buildsettings

Xcode IDE Build Settings

You can define what and how build settings are displayed by selecting Basic / All or Combined / Levels. There are multiple lists of build setting values, though only one such list applies when a particular build is performed. Each list is called a configuration. By default there are two configurations :

  • Debug : used through the development process
  • Release : used for late-stage and performance testing, before submitting the app to the Apple App Store
Xcode IDE Configurations

Xcode IDE Configurations

The defaults of the build settings are usually acceptable and need not be changed.

Schemes and Destinations

xcode_schemeTo tell the Xcode IDE which configuration to use during a particular build is determined by a scheme. A scheme unites a target with a build configuration. A new project commes by default with a single scheme named after the project.

The menu Product -> Scheme -> Edit Scheme shows the current scheme. You can also show the scheme by using the scheme pop-up menu in the Project Window Toolbar.

A scheme has the following actions :

  • Build
  • Run
  • Test
  • Profile
  • Analyze
  • Archive
Xcode IDE destinations

Xcode IDE destinations

Hierarchically appended to each scheme listed in the scheme pop-up menu are the destinations. A destination is a machine that can run the app. This can be a real device connected to the Mac computer running the Xcode IDE or a virtual device simulated in the Simulator.

The menu Window -> Devices allow you to download and install additionaliOS SDK’s and to manage the devices to appear as destinations in the scheme popu-up menu.

Xcode Flow and Hierarchy

Most of the work in an app is done by the UIApplicationMain function which is automatically called in the project’s main.m source file within an autorelease pool (memory management with Automatic Reference Counting : ARC) . An application object is created and a run loop is launched that delivers input events to the app.

The other files in an Xcode project do the following tasks :

  • AppDelegate class with AppDelegate.h (header) and AppDelegate.m (implementation) files. This class creates an AppDelegate object as an entry point to the application. When the AppDelegate object notices that the application is launched it creates the ViewController object.
  • ViewController class with ViewController.h and ViewController.m files. This class is responsible for controlling and managing a view. The generated ViewController object is responsible for setting up the view described in Main.storyboard and showing it on the screen to the user.
  • Main.storyboard file is the view that the ViewController class is managing. The Main.storyboard file is created and modified with a visual designer, called “Interface Builder”, in the Xcode IDE.

The implementation file AppDelegate.m contains skeletons of important predefined methods :

  • application didFinishLaunchingWithOptions
  • applicationWillResignActive
  • applicationDidEnterBackground
  • applicationWillEnterForeground
  • applicationDidBecomeActive
  • applicationWillTerminate

You can add additional code to these methods that you want to be executed when the methods are called.

Xcode Storyboard, Screen, Scene, Canvas, View

An Xcode storyboard is the visual representation of the app’s user interface, showing screens of content and the transitions between them. The background of the storyboard is the canvas.

In a single view application, the storyboard contains only one scene. The arrow that points to the left of the scene on the canvas is the storyboard entry point, which means that this scene is loaded first when the app starts. If you have more than one scene, you can zoom the canvas in and  out with the menu Editor -> Canvas -> Zoom to see the whole content.

Xcode scene entry point

Xcode scene entry point

To adapt the generic scene in the canvas to real devices in different orientations, you must create an adaptive interface with the AutoLayout engine. The four icons Align, Pin, Resolve Issues and Resize Behavior allow to fix and manage constraints to adapt the interface. Use the assistant editor in the preview mode with the wAny and hAny icons to check the adapted interface.

Xcode IDE auto

Xcode IDE AutoLayout engine tools

Xcode IDE Library with Interface Elements

Xcode IDE Interface Elements

The library of objects provides UI elements (views, buttons, text labels, gesture controllers, …) to arrange the user interface. The size of UI elements can be modified by dragging its resize handles (small white squares) on the canvas.

The UIScreen object represents the physical iPhone or iPad screen. The UIWindow object provides drawing support for the screen. UIView objects are attached to the UIWindow object and draw their contents when directed to do so by the window. The visual interface of an app is essentially a set of UIView objects (views), which are themselves managed by UIViewController objects. UIViewController objects interact with views to determine what’s displayed by the views, handle any interactions with the user, and perform the logic of a program.

The views have following properties :

  • Views represent a single item (button, slider, image, text field, …) on the user interface, cover a specific area on the screen and can be configured to respond to touch events.
  • Views can be nested inside other views, which lead to the notion of a view hierarchy. They are referred as subviews (childs) or superviews (parents). Subviews are drawn and positioned relative to their superview. At the top is the window object.
  • Views are not the place where the bulk of a program logic resides.

UIKit views are grouped in the following general categories :

  • Content : image, text, …
  • Collections : group of views
  • Controls : buttons, sliders, switches, …
  • Bars : toolbar, navigation, tabs, …
  • Input : search, info, …
  • Containers : scroll, …
  • Modal : alerts, action sheets, …

The storyboard is not the only way to program an interface, you can also use .xib files or custom code. Storyboard and .xib files are converted to a bundle of NIB files at compilation.

Anatomy of an App

An app file is a special kind of folder called a package, and a special kind of package called a bundle. Use the Finder to locate the HelloWorld.app on the Mac computer. By default it is located inside a build folder in the user directory at Library/Developer/Xcode/DerivedData/.

Right clicking on the filename HelloWorld.app opens a pop-up menu with the submenu “Show Package Contents“. The following items are inside the bundle :

  • HelloWorld (exec)
  • PkgInfo
  • Info.plist
  • Folder Base.lproj including LaunchScreen.nib and Main.storyboardc

Bilingual Targets

It is legal for a target to be a bilingual target that contains both Objective-C and Swift code. To assure communication between the two languages, Xcode provides bridging headers.

Links

Additional informations about Xcode projects are available at the websites with the following links :

iOS Development

Last update : September 10, 2015

Operating System

iOS is the operating system that runs on iPad, iPhone and iPod devices. The operating system manages the device hardware and provides the technologies required to implement native apps.

The iOS architecture is layered in four levels :

  • Cocoa Touch (at the top level)
  • Media
  • Core services
  • Core OS (at the bottom level)

At the highest level, iOS acts as an intermediate between the underlying hardware and the apps.

iOS Development

To develop applications for iOS devices the following tools are required :

  • Mac computer running OS X 10.9.4 or later
  • Xcode
  • iOS SDK
  • Apple Developer Program
  • Documentation

Xcode

Xcode is an integrated development environment (IDE) created by Apple for developing software for OS X and iOS devices. First released in 2003, the latest stable release is version 6.4 and is available via the Mac App Store free of charge for OS X Yosemite users. Xcode 7 is available as beta version. Xcode is installed into the /Applications directory.

welcome

Welcome Screen of the Xcode IDE

Xcode is used in two ways. It is not only the name for the suite of developer tools, but also the name of an application within that suite : an Xcode project.

The program code to develop an iOS App is embedded in an Xcode project which is saved itself in an Xcode workspace. A workspace is a container that encompasses multiple projects that share common resources.

The result of the development process is called product, specified by an Xcode target. Projects can contain one or more targets, each of which produces one product.  Usually there are 2 targets : a release version  and tests. The targets can also be a plain and light version of an app for the Apple Store.

An Xcode scheme defines a collection of targets to build, a configuration to use when building, and a collection of tests to execute. Introduced with Xcode 4, schemes are a powerful way to control how you build, debug, test, analyze, profile, and deploy your application. You can have as many schemes as you want, but only one can be active at a time. Xcode uses build settings to specify aspects of the build process followed to generate a product. A build setting is a variable that determines how build tasks are performed.

Another concept of Xcode are templates. Xcode includes the following built-in app templates for developing common types of iOS apps, such as games, with preconfigured interface and source code files :

  • Single View Application
  • Master-Detail Application
  • Page-Based Application
  • Tabbed Application
  • Game
  • Cocoa Touch Framework
  • Cocoa Touch Static Library
  • In-App Purchase
  • Empty

Since Snow Leopard (Xcode 3.2), Xcode uses the LLVM compiler, based on the open source LLVM.org project. The LLVM compiler uses the Clang front end to parse source code and turn it into an interim format.

Xcode supports the following program files :

  • xxx.h : headers (interfaces)
  • xxx.m : Objective-C implementations
  • xxx.mm : Objective-C++ implementations
  • xxx.swift : Swift implementations
  • xxx.c :  C implementations
  • xxx.cpp : C++ implementations
xcode

Xcode Editor, Storybuilder and Utilities windows

The developed app can be tested on a real iOS device or in the iOS simulator integrated in Xcode.

iOS SDK

The iOS SDK is a software development kit developed by Apple and released in February 2008 to develop native applications for iOS devices. The initial name was iPhone SDK.

The main features of the iOS SDK are :

  • Objective-C
  • Swift
  • Cocoa Touch
  • Frameworks

Apple Developer Program

To get everything you need to develop and distribute your iOS apps, you must enroll in an Apple Developer Program. The iOS Developer Program costs 99$ per year. After registration, payment and acceptance by Apple, I did my first login to the Apple Developer Member Center with my Apple ID on April 24, 2015.

Apple Developer Member Center Login Window

Apple Developer Member Center Login Window

id

Registrated iOS Development Agent

ios_id

Signing Identities for iOS Development and Distribution

iOS Apps must be signed by the developer to deploy them on an iOS device. To test a signed app on a real device, you must registrate this device. You can registrate up to 100 iOS devices in your development account. The devices must be connected to the Mac computer running Xcode to test and install an app.

To register an iOS device for development, you must know its UDID (Unique Device Identifier). There are several possibilities to get this UDID when the device is connected to the Mac Computer :

  • with iTunes running on a Mac
  • with the Mac USB Inspector
  • with Xcode running on a Mac
usb

Get the iOS UDID with the USB Inspector on a Mac Computer

iphone

Get the iOS UDID in Xcode

Without connection an iOS device to a Mac computer, it’s possible to access the webpage get.udid.io to reveal it.

I registrated my iPhone 5 and my iPad 2 in my iOS development account.

register

Registration of an iPad and an iPhone in the iOS Development Program

 

 

add

Review and confirmation of the provided informations

 

After reviewing and confirming the informations, the registration was complete.

complete

Completed Registration

 

Documentation

The iOS Developer Library is the most important documentation resource to help developers. The library contains API references, programming guides, release notes, technical notes, sample code and other documents.

The tutorial Start Developing iOS Apps Today is he perfect starting point for creating apps that run on iPad, iPhone, and iPod touch.

Cocoa and Cocoa Touch

Cocoa is Apple’s native object-oriented application programming interface (API) for the OS X operating system. Cocoa Touch is the API for iOS. Cocoa and Cocoa Touch follows a Model-View-Controller (MVC) software architecture.

Objective-C

Objective-C is a general-purpose, object-oriented programming language that adds Smalltalk-style messaging to the C programming language. It is the main programming language used by Apple for the OS X and iOS operating systems. Objective-C was originally developed in the early 1980s. It was selected as the main language used by NeXT for its NeXTSTEP operating system, from which OS X and iOS are derived.

Swift

Swift is a multi-paradigm, compiled programming language launched by Apple in 2014 for iOS and OS X development. Swift is intended to be more resilient to erroneous code than Objective-C, and also more concise. It uses the Objective-C runtime, allowing C, Objective-C, C++ and Swift code to run within a single program.

Frameworks

The system interfaces to the iOS Technologies are delivered by Apple in the iOS SDK in special packages called Frameworks. These are directories that contain shared libraries and the resources needed to support that library. The main iOS Frameworks provided are the following :

Links

Additional useful informations about iOS development are provided at the websites referenced hereafter :

CMakeLists.txt

Last update : June 13, 2015

A CMakelists.txt file is the only thing needed by CMake to generate makefiles for many platforms and IDEs. CMake is a cross-platform free and open-source configurator for managing the build process of software using a compiler-independent method. It is designed to support directory hierarchies and applications that depend on multiple libraries. It is used in conjunction with native build environments such as make, Apple’s Xcode, and Microsoft Visual Studio. CMake has minimal dependencies, requiring only a C++ compiler on its own build system. It refers to the entire family of tools : CMake, CTest, CDash, and CPack.

CMake development began in the year 1999 in response to the need for a cross-platform build environment for the Insight Segmentation and Registration Toolkit (ITK). The project was funded by the United States National Library of Medicine (NLM) as part of the Visible Human Project. It was inspired by the Visualization Toolkit (VTK) at Kitware who maintains the CMake software. The latest version of CMake is  3.2.2 released on April 14, 2015.

The ability to build a directory tree outside the source tree is a key feature of CMake, ensuring that if a build directory is removed, the source file remains unaffected.

A CMake generator translates the platform-agnostic CMakeLists.txt file and produces a platform-dependent build system that you can use to compile the project. CMake can find dependencies, do conditional builds and even set different compile or link flags depending on the choice of different options by the user.

During a build with CMake there are three directories involved :

  • source dir
  • build dir
  • install dir

CMake source directory

The source directory is where the project’s sources are stored. This is the directory to which you extract the project source archive or to where you checked out the sources from a version control system.  Typically a source is any file that is explicitly modified by the developer. The source directory also contains the files which describe the build to CMake : a bunch of CMakeLists.txt files in the source directory and probably in several sub-directories. Eventually additional CMake scripts with the extension .cmake are stored in a special CMake directory. By default CMake does not alter any file in the source directory and doesn’t add new ones.

CMake build directory

The build directory is where all compiler outputs are stored, which includes both object files as well as final executables and libraries. CMake also stores several files of its own there, including its cache. You can have more than one build directory for the same source directory with different build options. The concept to have a build directory separated from the source directory is called Out of source Build. Its much easier to convince yourself that your build has been cleaned since you can simply delete the build folder and start over again.

CMake install directory

CMake is able to copy all relevant files for the build project to an install directory.

CMake GUI

The most basic CMake frontend is the CMake console which is available on all platforms. CMake has also a GUI frontend called cmake-gui.

CMake build options

I mainly use CMake with Visual Studio C++ 2010 and with Xcode on Mac OSX Yosemite.

There are three Visual Studio 2010 generators :

  • Visual Studio 10 2010 : creates project files for the x86 (32bit) processor, running also on x64 processors
  • Visual Studio 10 2010 IA64 : creates project files for the x64 processor
  • Visual Studio 10 2010 Win64 : creates project files for the Itanium processor

Visual Studio 2010 itself is a 32bit program.

There is only one Xcode generator.

CMake Language Syntax

CMake has its own basic scripting language. A listfile may contain only commands and comments. Comments start with a # character and stop at the end of a line. The names of commands are case insensitive, the arguments are case sensitive. A list of all commands is available at the CMake website. The main commands are :

  • add_definitions
  • add_executable
  • add_library
  • cmake_minimum_required
  • find_library
  • include_directories
  • link_directories
  • message
  • project
  • set
  • set_property
  • source_group
  • target_link_libraries

The basic type in CMake is a string. The SET command is used to set the value of a variable to some string. It takes two arguments: the name of the variable and its new value. To access the value of a variable, you perform a substitution with the syntax ${variable name}.

Examples :

# create a list of strings
set (VAR a b c)
# use the system version of BOOST
set (USE_SYSTEM_BOOST ON)
# use CACHE to let user set a value in CMake GUI
set (USE_SYSTEM_SQLITE OFF CACHE BOOL "Don't use the system version")

Finding dependencies

CMake can automatically search for the location of dependency libraries to include in the project.

CMake Cross-Compiling

Cross-compiling is supported by CMake starting with version 2.6.0. Cross compiling means that the software is built for a different system than the one which does the build. This means

  • CMake cannot autodetect the target system
  • Usually the executables don’t run on the build host
  • The build process has to use a different set of include files and libraries for building, i.e. not the native ones

CMake Example

The following example shows the CMakeLists.txt file of the OrthancWebViewer 1.0 project. The following parameters are used :

  1. parameters of the build : set (VAR ON CACHE BOOL “some text”)
  2. parameters to fine-tune linking against system libraries  :
    set (VAR ON CACHE BOOL “some text”)
  3. distribution specific settings : set (VAR OFF CACHE BOOL “some text”)
  4. include resources and abcde.cmake files
  5. check the availability of the Orthanc SDK
  6. add specific definitions
  7. set (CORE_SOURCES …)
  8. add libraries and set target properties
  9. install targets
  10. add UnitTests

The following figures show the CMake GUI on Windows and Mac OSX handling the OrthancWebViewer project.

CMake

CMake version 3.0.1 on Windows 8.1

CMake

CMake version 3.2.3 on Mac OSX

On Mac OSX Cmake generated a CMakeCache.txt file in the Build folder and a CMakeOutput.log file in the Build/CMakeFiles folder.

CMake links

The following list provides links to websites with addition informations about Cmake :

Installation de Festival TTS sur Debian

Last update : April 9, 2015

Suite à l’installation du système Festival TTS sur mon MacBook Air il y a trois mois, je viens de l’installer sur mon laptop avec système d’exploitation Debian 7 Linux. Les différentes archives du système Festival ont été téléchargées et décomprimées avec ARC.

Décompression d'un archive Festival

Décompression d’une archive Festival

J’ai suivi ensuite la même procédure que sur Mac OSX, à savoir

  • création d’un répertoire Festival-TTS sur le desktop avec les sous-répertoires festival, speech_tools et festvox
  • compilation des programmes dans l’ordre speech_tools, festival, festvox
  • installation des voix et dictionnaires dans les sous-répertoires lib/voices et lib/dicts du répertoire festival
Configuration du

Configuration du programme speech_tools

La compilation du programme speech_tools s’est arrêtée avec les messages d’erreur

/usr/bin/ld: cannot find -lcurses
/usr/bin/ld: cannot find -lncurses

L’installation de la bibliothèque libncurses5-dev a réglé ce problème. La suite de la compilation s’est passée sans autres erreurs, abstraction faite de plusieurs avertissements concernant des variables spécifiées, mais non utilisées .

Il a été possible de démarrer le programme Festival avec la commande

/Desktop/Festival-TTS/festival/bin/festival

mais la synthèse d’une phrase de test

festival> (SayText "Hello, how are you")

a produit l’erreur

Linux: can't open /dev/dsp

Parmi les remèdes trouvés sur le net, j’ai opté pour la solution

apt-get install oss-compat
modprobe snd-pcm-oss

qui a été couronnée de succès.

Il ne restait plus que la configuration des différents chemins d’accès pour mettre le système Festival tout à fait opérationnel. Les commandes suivantes ont été ajoutées au script ~/.bashrc :

FESTIVALDIR="/home/mbarnig/Desktop/Festival-TTS/festival"
FESTVOXDIR="/home/mbarnig/Desktop/Festival-TTS/festvox"
ESTDIR="/home/mbarnig/Desktop/Festival-TTS/speech_tools"
PATH="$FESTIVALDIR/bin:$PATH"
PATH="$ESTDIR/bin:$PATH"
export PATH
export FESTIVALDIR
export FESTVOXDIR
export ESTDIR
Pat

Variables d’environnement et PATH du système Festival

Ca marche!

Lancement

Lancement du programme Festival TTS sur Debian Linux Wheezy

Le chargement respectivement la compilation d’une nouvelle voix, comme le luxembourgeois, ne réussit que si la voix anglaise kal_diphon est présente, si non une erreur “unbound variable rfs_info” se produit.

Character encoding for Festival TTS files

Today, the dominant character encoding for the World Wide Web and for text files is UTF-8 (Universal Character Set + Transformation Format 8 bits). UTF-8 uses one to 4 bytes to encode one symbol (1,112,064 valid code points in the Unicode code space). The first 128 characters of Unicode, which correspond one-to-one with ASCII, are encoded using a single byte with the same binary value as ASCII, making valid ASCII text valid UTF-8-encoded Unicode as well.

The Festival TTS package doesn’t  support UTF-8. The development of Festival started about 20 years ago when UTF-8 was only known by a few people of the Open Group for Unix Systems. Festival only supports one byte character encoding. All files created with the Festvox tools to develop new languages or voices for Festival are in US-ASCII format.

Problems appear if we need to use non-ASCII characters in Festival, for example the characters é è ë à ä ö ü for the luxembourgish language. In UTF-8 these characters are encoded with two bytes (16 bits), which yields errors in Festival. There exist however a series of 8-bit character encoding standards defined by ECMA, IEC and ISO. These standards are known as ISO-8859-x, where x is one of 15 parts. The listed luxembourgish specific characters are included in the ISO-8859 parts 1, 2, 3, 4, 9, 10, 14, 15 and 16. The  preferred ISO-8859 standard for the luxembourgish language is part 15, which includes the euro sign and provides the coverage of all the french and german letters. ISO 8859-15 encodes what it refers to as Latin alphabet no. 9.

Most text editors and other tools used today to write scripts and program code use UTF-8 as default format. On Windows I use Notepad++ to edit my files. Changing the encoding is easy. On my Mac OSX 10.10.2 (Yosemite) I use TextEdit, Terminal and Xcode to edit my files for Festival TTS. I changed the following preferences to encode my scripts and programs in ISO-8859-15.

TextEdit

I first checked the list of the encoding formats to show in the selection menu of the preference window.

List of character encoding formats in TextEdit

List of character encoding formats in TextEdit

The Occidental (ISO Latin 9) format corresponds to the ISO-8859-15 standard.

TextEdit

Preferences selected in TextEdit

Terminal (bash)

Same procedure for the OSX terminal. I checked the list of the encoding formats to show in the selection menu of the preference window.

List of character encoding formats in the OSX Terminal

List of character encoding formats in the OSX Terminal

In the OSX terminal preference window I also selected the Occidental (ISO Latin 9) format corresponding to the ISO-8859-15 standard.

Terminal

Preferences selected for the OSX Terminal

XCode 6.2

An ISO-8859-15 encoded text file (for example created witt TextEdit) is not recognized as such by Xcode 6.2. Characters as “é à è ö ä ü” are displayed as “È ‡ Ë ˆ ‰ ¸ “. The indicated Xcode decoding is Western (Mac OS Roman), even if the default text encoding set in Preferences > Text Editing is set to Western (ISO Latin 9). The characters are displayed as expected if the text is reinterpreted with the File Inspector to ISO Latin 9. Files created with Xcode are encoded always in UTF-8, the default text encoding setting seems to take no effect.

The Xcode window to select potential encoding formats uses a different presentation, but displays the same formats as those in TextEdit or OSX Terminal.

Xcode

List of encoding formats in Xcode

Again I defined the default text encoding as Occidental (ISO Latin 9) alias ISO-8859-15.

Xcode

Preferences selected for Xcode

File conversion

To check the text encoding of a file we can use the command file

mbarnig$ file -I name.ext

Some examples are shown hereafter :

file

Show file encoding with the command $ file -I

To convert an UTF-8 file in the ISO-8859-15 format we can use the command iconv

mbarnig$ iconv -f utf8 -t ISO-8859-1 utf8.txt > iso.txt

An example is shown hereafter :

iconv

File format conversion with the command $ iconv

The encoding formats available in iconv are listed with the option -l or –list :

iconv

Encoding formats available in iconv

Environment Variables

To show the environment variables we can use the command locale :

mbarnig$ locale
locale

Environment variables in OSX locale

LANG=                         ; native language + local customization if no LC_ variables
LC_COLLATE=”C”       ; character collation
LC_CTYPE=”C”           ; character handling functions
LC_MESSAGES=”C”   ; affirmative and negative responses
LC_MONETARY=”C”   ; monetary related numeric formatting
LC_NUMERIC=”C”      ; numeric formatting
LC_TIME=”C”              ; date and timing formatting
LC_ALL=                     ; value to overwrite all LC_ variables

“C” stands for simplest locale (character set ASCII, single byte character encoding, language US-english, …)

To show all the available locales we use the option -a

mbarnig$ locale -a
locale -a

List of locales defined in OSX 10.10.2

To change the locales we use the export function :

mbarnig$ export LANG=fr_CH
mbarnig$ export LC_ALL=fr_CH.ISO8859-15
locate_iso

Modification of locales in OSX 10.10.2 with the export command

The export functions can be included in the /Users/mbarnig/.bash_profile file.

Links

Speech Utterance

Last update : April 2, 2015

Utterance Definition

In linguistics an utterance is a unit of speech, without having a precise definition. It’s a bit of spoken language. It could be anything from “Baf!” to a full sentence or a long speech. The corresponding unit in written language is text.

Phonetically an utterance is a unit of speech bounded (preceded and followed) by silence. Phonemes, phones, morphemes,  words etc are all considered items of an utterance.

In orthography, an utterance begins with a capital letter and ends in a period, question mark, or exclamation point.

In Speech Synthesis (TTS) the text that you wish to be spoken is contained within an utterance object (example : SpeechSynthesisUtterance). The Festival TTS system uses the utterance as the basic object for synthesis. Speech synthesis is the process that applies a set of programs to an utterance.

The main stages to convert textual input to speech output are :

  1. Conversion of the input text to tokens
  2. Conversion of tokens to words
  3. Conversion of words to strings of phonemes
  4. Addition of prosodic information
  5. Generation of a waveform

In Festival each stage is executed in several steps. The number of steps and what actually happens may vary and is dependent on the particular language and voice selected. Each of the steps is achieved by a Festival module which will typically add new information to the utterance structure. Swapping of modules is possible.

Festival provides six synthesizer modules :

  • 2 diphone engines : MBROLA and diphone
  • 2 unit selection engines : clunits and multisyn
  • 2 HMM engines : clustergen and HTS

Festival Utterance Architecture

A very simple utterance architecture is the string model where the high level items are replaced sequentially by lower level items, from tokens to phones. The disadvantage of this architecture is the loss of information about higher levels.

Another architecture is the multi-level table model with one hierarchy. The problem is that there are no explicit connections between levels.

Festival uses a Heterogeneous Relation Graph (HRG). This model is defined as follows :

  • Utterances consist of a set of items, representing things like tokens, words, phones,
  • Each item is related by one or more relations to other items.
  • Each item contains a set of features, having each a name and a value.
  • Relations define lists, trees or lattices of items.

The stages and steps to build an utterance in Festival, described in the following chapters, are related to the us-english language and to the clustergen voice cmu_us_slt_cg.

To explore the architecture (structure) of an utterance in Festival, I will analyse the relation-trees created by the synthesis of the text string “253”.

festival> (voice_cmu_us_slt_cg)
cmu_us_slt_cg
festival> (set! utter (SayText "253"))
#<Utterance 0x104c20720>
festival> (utt.relationnames utter)
(Token
 Word
 Phrase
 Syllable
 Segment
 SylStructure
 IntEvent
 Intonation
 Target
 HMMstate
 segstate
 mcep
 mcep_link
 Wave)
festival> (utt.relation_tree utter 'Token)
((("253"
   ((id "_1")
    (name "253")
    (whitespace "")
    (prepunctuation "")
    (token_pos "cardinal")))
  (("two"
    ((id "_2")
     (name "two")
     (pos_index 1)
     (pos_index_score 0)
     (pos "cd")
     (phr_pos "cd")
     (phrase_score -0.69302821)
     (pbreak_index 1)
     (pbreak_index_score 0)
     (pbreak "NB"))))
  (("hundred"
    ((id "_3")
     (name "hundred")
     (pos_index 1)
     (pos_index_score 0)
     (pos "cd")
     (phr_pos "cd")
     (phrase_score -0.692711)
     (pbreak_index 1)
     (pbreak_index_score 0)
     (pbreak "NB"))))
  (("fifty"
    ((id "_4")
     (name "fifty")
     (pos_index 8)
     (pos_index_score 0)
     (pos "nn")
     (phr_pos "n")
     (phrase_score -0.69282991)
     (pbreak_index 1)
     (pbreak_index_score 0)
     (pbreak "NB"))))
  (("three"
    ((id "_5")
     (name "three")
     (pos_index 1)
     (pos_index_score 0)
     (pos "cd")
     (phr_pos "cd")
     (pbreak_index 0)
     (pbreak_index_score 0)
     (pbreak "B")
     (blevel 3))))))
festival> (utt.relation_tree utter 'Word)
((("two"
   ((id "_2")
    (name "two")
    (pos_index 1)
    (pos_index_score 0)
    (pos "cd")
    (phr_pos "cd")
    (phrase_score -0.69302821)
    (pbreak_index 1)
    (pbreak_index_score 0)
    (pbreak "NB"))))
 (("hundred"
   ...
   ...
    (blevel 3)))))
festival> (utt.relation_tree utter 'Phrase)
((("B" ((id "_6") (name "B")))
  (("two"
    ((id "_2")
     (name "two")
     (pos_index 1)
     (pos_index_score 0)
     (pos "cd")
     (phr_pos "cd")
     (phrase_score -0.69302821)
     (pbreak_index 1)
     (pbreak_index_score 0)
     (pbreak "NB"))))
  (("hundred"
    ...
    ...
     (blevel 3))))))
festival> (utt.relation_tree utter 'Syllable)
((("syl" ((id "_7") (name "syl") (stress 1))))
 (("syl" ((id "_10") (name "syl") (stress 1))))
 (("syl" ((id "_14") (name "syl") (stress 0))))
 (("syl" ((id "_19") (name "syl") (stress 1))))
 (("syl" ((id "_23") (name "syl") (stress 0))))
 (("syl" ((id "_26") (name "syl") (stress 1)))))
festival> (utt.relation_tree utter 'Segment)
((("pau" ((id "_30") (name "pau") (end 0.15000001))))
 (("t" ((id "_8") (name "t") (end 0.25016451))))
 (("uw" ((id "_9") (name "uw") (end 0.32980475))))
 (("hh" ((id "_11") (name "hh") (end 0.39506164))))
 (("ah" ((id "_12") (name "ah") (end 0.48999402))))
 (("n" ((id "_13") (name "n") (end 0.56175226))))
 (("d" ((id "_15") (name "d") (end 0.59711802))))
 (("r" ((id "_16") (name "r") (end 0.65382934))))
 (("ax" ((id "_17") (name "ax") (end 0.67743915))))
 (("d" ((id "_18") (name "d") (end 0.75765681))))
 (("f" ((id "_20") (name "f") (end 0.86216313))))
 (("ih" ((id "_21") (name "ih") (end 0.93317086))))
 (("f" ((id "_22") (name "f") (end 1.0023116))))
 (("t" ((id "_24") (name "t") (end 1.0642071))))
 (("iy" ((id "_25") (name "iy") (end 1.1534019))))
 (("th" ((id "_27") (name "th") (end 1.2816957))))
 (("r" ((id "_28") (name "r") (end 1.3449684))))
 (("iy" ((id "_29") (name "iy") (end 1.5254952))))
 (("pau" ((id "_31") (name "pau") (end 1.6754951)))))
festival> (utt.relation_tree utter 'SylStructure)
((("two"
   ((id "_2")
    (name "two")
    (pos_index 1)
    (pos_index_score 0)
    (pos "cd")
    (phr_pos "cd")
    (phrase_score -0.69302821)
    (pbreak_index 1)
    (pbreak_index_score 0)
    (pbreak "NB")))
  (("syl" ((id "_7") (name "syl") (stress 1)))
   (("t" ((id "_8") (name "t") (end 0.25016451))))
   (("uw" ((id "_9") (name "uw") (end 0.32980475))))))
 (("hundred"
   ((id "_3")
    (name "hundred")
    (pos_index 1)
    (pos_index_score 0)
    (pos "cd")
    (phr_pos "cd")
    (phrase_score -0.692711)
    (pbreak_index 1)
    (pbreak_index_score 0)
    (pbreak "NB")))
  (("syl" ((id "_10") (name "syl") (stress 1)))
   (("hh" ((id "_11") (name "hh") (end 0.39506164))))
   (("ah" ((id "_12") (name "ah") (end 0.48999402))))
   (("n" ((id "_13") (name "n") (end 0.56175226)))))
  (("syl" ((id "_14") (name "syl") (stress 0)))
   (("d" ((id "_15") (name "d") (end 0.59711802))))
   (("r" ((id "_16") (name "r") (end 0.65382934))))
   (("ax" ((id "_17") (name "ax") (end 0.67743915))))
   (("d" ((id "_18") (name "d") (end 0.75765681))))))
 (("fifty"
   ((id "_4")
    (name "fifty")
    (pos_index 8)
    (pos_index_score 0)
    (pos "nn")
    (phr_pos "n")
    (phrase_score -0.69282991)
    (pbreak_index 1)
    (pbreak_index_score 0)
    (pbreak "NB")))
  (("syl" ((id "_19") (name "syl") (stress 1)))
   (("f" ((id "_20") (name "f") (end 0.86216313))))
   (("ih" ((id "_21") (name "ih") (end 0.93317086))))
   (("f" ((id "_22") (name "f") (end 1.0023116)))))
  (("syl" ((id "_23") (name "syl") (stress 0)))
   (("t" ((id "_24") (name "t") (end 1.0642071))))
   (("iy" ((id "_25") (name "iy") (end 1.1534019))))))
 (("three"
   ((id "_5")
    (name "three")
    (pos_index 1)
    (pos_index_score 0)
    (pos "cd")
    (phr_pos "cd")
    (pbreak_index 0)
    (pbreak_index_score 0)
    (pbreak "B")
    (blevel 3)))
  (("syl" ((id "_26") (name "syl") (stress 1)))
   (("th" ((id "_27") (name "th") (end 1.2816957))))
   (("r" ((id "_28") (name "r") (end 1.3449684))))
   (("iy" ((id "_29") (name "iy") (end 1.5254952)))))))
festival> (utt.relation_tree utter 'IntEvent)
((("L-L%" ((id "_32") (name "L-L%"))))
 (("H*" ((id "_33") (name "H*"))))
 (("H*" ((id "_34") (name "H*"))))
 (("H*" ((id "_35") (name "H*")))))
festival> (utt.relation_tree utter 'Intonation)
((("syl" ((id "_26") (name "syl") (stress 1)))
  (("L-L%" ((id "_32") (name "L-L%")))))
 (("syl" ((id "_7") (name "syl") (stress 1)))
  (("H*" ((id "_33") (name "H*")))))
 (("syl" ((id "_10") (name "syl") (stress 1)))
  (("H*" ((id "_34") (name "H*")))))
 (("syl" ((id "_19") (name "syl") (stress 1)))
  (("H*" ((id "_35") (name "H*"))))))
festival> (utt.relation_tree utter 'Target)
((("t" ((id "_8") (name "t") (end 0.25016451)))
  (("0" ((id "_36") (f0 101.42016) (pos 0.1)))))
 (("uw" ((id "_9") (name "uw") (end 0.32980475)))
  (("0" ((id "_37") (f0 121.11904) (pos 0.25)))))
 (("hh" ((id "_11") (name "hh") (end 0.39506164)))
  (("0" ((id "_38") (f0 119.19957) (pos 0.30000001)))))
 (("ah" ((id "_12") (name "ah") (end 0.48999402)))
  (("0" ((id "_39") (f0 123.81679) (pos 0.44999999)))))
 (("d" ((id "_15") (name "d") (end 0.59711802)))
  (("0" ((id "_40") (f0 117.02986) (pos 0.60000002)))))
 (("ax" ((id "_17") (name "ax") (end 0.67743915)))
  (("0" ((id "_41") (f0 110.17942) (pos 0.85000008)))))
 (("f" ((id "_20") (name "f") (end 0.86216313)))
  (("0" ((id "_42") (f0 108.59299) (pos 1.0000001)))))
 (("ih" ((id "_21") (name "ih") (end 0.93317086)))
  (("0" ((id "_43") (f0 115.24371) (pos 1.1500001)))))
 (("t" ((id "_24") (name "t") (end 1.0642071)))
  (("0" ((id "_44") (f0 108.76601) (pos 1.3000002)))))
 (("iy" ((id "_25") (name "iy") (end 1.1534019)))
  (("0" ((id "_45") (f0 102.23844) (pos 1.4500003)))))
 (("th" ((id "_27") (name "th") (end 1.2816957)))
  (("0" ((id "_46") (f0 99.160072) (pos 1.5000002)))))
 (("iy" ((id "_29") (name "iy") (end 1.5254952)))
  (("0" ((id "_47") (f0 90.843689) (pos 1.7500002))))
  (("0" ((id "_48") (f0 88.125809) (pos 1.8000003))))))
festival> (utt.relation_tree utter 'HMMstate)
((("pau_1" ((id "_49") (name "pau_1") (statepos 1) (end 0.050000001)*
 (("pau_2" ((id "_50") (name "pau_2") (statepos 2) (end 0.1))))
 (("pau_3" ((id "_51") (name "pau_3") (statepos 3) (end 0.15000001)*
 (("t_1" ((id "_52") (name "t_1") (statepos 1) (end 0.16712391))))
 (("t_2" ((id "_53") (name "t_2") (statepos 2) (end 0.23217295))))
 (("t_3" ((id "_54") (name "t_3") (statepos 3) (end 0.25016451))))
 (("uw_1" ((id "_55") (name "uw_1") (statepos 1) (end 0.2764155))))
 (("uw_2" ((id "_56") (name "uw_2") (statepos 2) (end 0.3001706))))
 (("uw_3" ((id "_57") (name "uw_3") (statepos 3) (end 0.32980475))))
 (("hh_1" ((id "_58") (name "hh_1") (statepos 1) (end 0.3502973))))
 ...
 ...
 (("iy_1" ((id "_100") (name "iy_1") (statepos 1) (end 1.3995106))))
 (("iy_2" ((id "_101") (name "iy_2") (statepos 2) (end 1.4488922))))
 (("iy_3" ((id "_102") (name "iy_3") (statepos 3) (end 1.5254952))))
 (("pau_1" ((id "_103") (name "pau_1") (statepos 1) (end 1.5754951)*
 (("pau_2" ((id "_104") (name "pau_2") (statepos 2) (end 1.6254952)*
 (("pau_3" ((id "_105") (name "pau_3") (statepos 3) (end 1.6754951)*
festival> (utt.relation_tree utter 'segstate)
((("pau" ((id "_30") (name "pau") (end 0.15000001)))
  (("pau_1" ((id "_49") (name "pau_1") (statepos 1) (end 0.050000001)
  (("pau_2" ((id "_50") (name "pau_2") (statepos 2) (end 0.1))))
  (("pau_3" ((id "_51") (name "pau_3") (statepos 3) (end 0.15000001)*
 (("t" ((id "_8") (name "t") (end 0.25016451)))
  (("t_1" ((id "_52") (name "t_1") (statepos 1) (end 0.16712391))))
  (("t_2" ((id "_53") (name "t_2") (statepos 2) (end 0.23217295))))
  (("t_3" ((id "_54") (name "t_3") (statepos 3) (end 0.25016451)))))
 (("uw" ((id "_9") (name "uw") (end 0.32980475)))
  (("uw_1" ((id "_55") (name "uw_1") (statepos 1) (end 0.2764155))))
  (("uw_2" ((id "_56") (name "uw_2") (statepos 2) (end 0.3001706))))
  (("uw_3" ((id "_57") (name "uw_3") (statepos 3) (end 0.32980475))))
 ...
 ...
 (("iy" ((id "_29") (name "iy") (end 1.5254952)))
  (("iy_1" ((id "_100") (name "iy_1") (statepos 1) (end 1.3995106))))
  (("iy_2" ((id "_101") (name "iy_2") (statepos 2) (end 1.4488922))))
  (("iy_3" ((id "_102") (name "iy_3") (statepos 3) (end 1.5254952))*
 (("pau" ((id "_31") (name "pau") (end 1.6754951)))
  (("pau_1" ((id "_103") (name "pau_1") (statepos 1) (end 1.5754951)*
  (("pau_2" ((id "_104") (name "pau_2") (statepos 2) (end 1.6254952)*
  (("pau_3" ((id "_105") (name "pau_3") (statepos 3) (end 1.6754951)*
festival> (utt.relation_tree utter 'mcep)
((("pau_1"
   ((id "_106")
    (frame_number 0)
    (name "pau_1")
    (clustergen_param_frame 19315))))
 (("pau_1"
   ((id "_107")
    (frame_number 1)
    (name "pau_1")
    (clustergen_param_frame 19315))))
 (("pau_1"
   ((id "_108")
    (frame_number 2)
    (name "pau_1")
    (clustergen_param_frame 19315))))
 (("pau_1"
   ((id "_109")
    (frame_number 3)
    (name "pau_1")
    (clustergen_param_frame 19315))))
 ...
 ...
 (("t_1"
   ((id "_137")
    (frame_number 31)
    (name "t_1")
    (clustergen_param_frame 26089))))
 (("t_1"
   ((id "_138")
    (frame_number 32)
    (name "t_1")
    (clustergen_param_frame 26085))))
 (("t_1"
   ((id "_139")
    (frame_number 33)
    (name "t_1")
    (clustergen_param_frame 26085))))
 (("t_2"
   ((id "_140")
    (frame_number 34)
    (name "t_2")
    (clustergen_param_frame 26642))))
...
...
 (("uw_1"
   ((id "_157")
    (frame_number 51)
    (name "uw_1")
    (clustergen_param_frame 27595))))
 ...
 (("pau_3"
   ((id "_438")
    (frame_number 332)
    (name "pau_3")
    (clustergen_param_frame 22148))))
 (("pau_3"
   ((id "_439")
    (frame_number 333)
    (name "pau_3")
    (clustergen_param_frame 22148))))
 (("pau_3"
   ((id "_440")
    (frame_number 334)
    (name "pau_3")
    (clustergen_param_frame 22148))))
 (("pau_3"
   ((id "_441")
    (frame_number 335)
    (name "pau_3")
    (clustergen_param_frame 22365)))))
festival> (utt.relation_tree utter 'mcep_link)
((("pau_1" ((id "_49") (name "pau_1") (statepos 1) (end 0.050000001).
  (("pau_1"
    ((id "_106")
     (frame_number 0)
     (name "pau_1")
     (clustergen_param_frame 19315))))
  (("pau_1"
    ((id "_107")
     (frame_number 1)
     (name "pau_1")
     (clustergen_param_frame 19315))))
  (("pau_1"
    ((id "_108")
     (frame_number 2)
     (name "pau_1")
     (clustergen_param_frame 19315))))
  ...
  ...
  (("pau_3"
    ((id "_439")
     (frame_number 333)
     (name "pau_3")
     (clustergen_param_frame 22148))))
  (("pau_3"
    ((id "_440")
     (frame_number 334)
     (name "pau_3")
     (clustergen_param_frame 22148))))
  (("pau_3"
    ((id "_441")
     (frame_number 335)
     (name "pau_3")
     (clustergen_param_frame 22365))))))
festival> (utt.relation_tree utter 'Wave)
((("0" ((id "_442") (wave "[Val wave]")))))
festival>

Notes :
* some parentheses have been deleted in the display for formating reasons
… some content has been deleted to reduce the size of the analyzed code

Results of the code analysis

The number of items created for the string “253” are shown in the following table :

number item id’s
1 token 1
4 word 2-5
1 phrase 6
6 syllable 7, 10, 14, 19, 23, 26
19 segment 8-9, 11-13, 15-18, 20-22, 24-25, 27-31
4 intevent 32-35
13 target 36-48
57 hmmstate 49-105
336 mcep 106-441
1 wave 442

The features associated to the different items are presented in the next table :

item features
token name, whitespace, prepunctuation, token_pos
word name, pos_index, pos_index_score, pos, phr_pos, phrase_score, pbreak_index, pbreak_index_score, pbreak, blevel
phrase name
syllable name, stress
segment name, end
intevent name
target f0, pos
hmmstate name, statepos, end
mcep name, frame_number, clustergen_param_frame
wave Val

The last table shows the relations between the different items in the HRG :

item daughter leaf relation
token word x Token
word syllable SylStructure
phrase word x Phrase
syllable segment x (except silence) SylStructure
syllable intevent x Intonation
segment target x Target
segment hmmstate x segstate
segment mcep x mcep_link

Relations between utterance items

To better understand the relations between utterance items, I use a second example :

festival>
(set! utter (SayText "253 and 36"))
(utt.relation.print utter 'Token)
Tok

Festival SayText

There are 3 tokens. The Token relation is a list of trees where each root is the white space separated tokenized object from the input character string and where the daughters are the list of words associated with the tokens. Most often it is a one to one relationship, but in the case of digits a token is associated with several words. The following command shows the Token tree with the daughters :

(utt.relation_tree utter 'Token)
Festival Token_tree

Festival Token_tree

We can check that the word list corresponds to the Token tree list :

(utt.relation.print utter 'Word)
Festival Word List

Festival Word List

To access the second word of the first token we can use two methods :

(item.name (item.daughter2 (utt.relation.first utter 'Token)))

or

(item.name (item.next (utt.relation.first utter 'Word)))
tok

Festival access methods to word item

TTS stages and steps

In the next chapters the different stages and steps executed to synthesize a text string are described with more details. In the first step a simple and a complex utterance of type Text are created :

(set! simple_utt (Utterance Text 
"The quick brown fox jumps over the lazy dog"))
(set! complex_utt (Utterance Text
"Mr. James Brown Jr. attended flight No AA4101 to Boston on
Friday 02/13/2014."))

The complex utterance named complex_utt is used in the following examples.

1. Text-to-Token Conversion

Text

Text is a string of characters in ASCII or ISO-8850 format. Written (raw) text usually contains also numbers, names, abbreviations, symbols, punctuation etc which must be translated into spoken text. This process is called Tokenization. Other terms used are (lexical) Pre-Processing, Text Normalization or Canonicalization. To access the items and features of the defined utterance named complex_utt in Festival we use the following modules :

festival> (Initialize complex_utt) ; Initialize utterance
festival> (utt.relationnames complex_utt) ; show created relations
Festival Utterance Initialization

Festival Utterance Initialization

The result nil indicates that there exist not yet a relation inside the text-utterance.

Tokens

The second step is the Tokenization which consists in the conversion of the input text to tokens. A token is a sequence of characters where whitespace and punctuation are eliminated. The following Festival command is used to convert raw text to tokens and to show them :

festival> (Text complex_utt) ; convert text to tokens
festival> (utt.relationnames complex_utt) ; check new relations
festival> (utt.relation.print complex_utt 'Token) ; display tokens
Festival Text Module to convert raw text to tokens

Festival Text Module to convert raw text to tokens

There are several methods to access individual tokens :

festival> (utt.relation.first complex_utt 'Token) ; returns 1st token
festival> (utt.relation.last complex_utt 'Token) ; returns last token
festival> (utt.relation_tree complex_utt 'Token) ; returns token tree
Festival Token Access

Festival Token Access

This utt.relation_tree method can also be applied to other relations than ‘Tokens.

2. Token-to-Word Conversion

Words

In linguistics, a word is the smallest element that may be uttered in isolation with semantic or pragmatic content. To convert the isolated tokens to words, we use the Festival commands :

festival> (Token complex_utt) ; token to word conversion
festival> (utt.relationnames complex_utt) ; check new relations
festival> (utt.relation.print complex_utt 'Word) ; display words
Festival Token Module to convert tokens to words

Festival Token Module to convert tokens to words

The rules to perform the token to word conversion are specified in the Festival script token.scm.

POS

Part-of-Speech (POS) Tagging is related to the Token-to-Word conversion. POS is also called grammatical tagging or word-category disambiguation. It’s the process of marking up a word in a text as corresponding to a particular part of speech, based on both its definition, as well as its context (identification of words as nouns, verbs, adjectives, adverbs, etc.)

To do the POS tagging, we use the commands

festival> (POS complex_utt) ; Part of Speech tagging
festival> (utt.relationnames complex_utt) ; check new relations
festival> (utt.relation.print complex_utt 'Word) ; display words

The relation check shows that no new relation was created with the POS method. There are however new features which have been added to the ‘Word relation.

Festival POS Module to tag the words

Festival POS Module to tag the words

The new features are :

  • pos_index n
  • pos_index_score m
  • pos xx

Phrase

The last step of the Token-to-Word conversion is the phrasing. This process determines the places where phrase boundaries should be inserted. Prosodic phrasing in TTS makes the whole speech more understandable. The phrasing is launched with the following commands :

festival> (Phrasify complex_utt) ;
festival> (utt.relationnames complex_utt) ; check new relations
festival> (utt.relation.print complex_utt 'Phrase) ; display breaks
Festival Phrasify Module to insert boundaries

Festival Phrasify Module to insert boundaries

The result can be seen in new attributes in the Word relation:

festival> (utt.relation.print complex_utt 'Word)
  • phr_pos xx
  • phrase_score nn
  • pbreak_index n
  • pbreak_index_score m
  • pbreak yy  (B for small breaks, BB is for big breaks, NB for no break)
  • blevel p
utt6a

Festival Word list after phrasing (click to enlarge)

3. Word-to-Phoneme Conversion

The command

festival> (Word complex_utt)

generates 3 new relations : syllables, segments and SylStructure.

utt7

Festival relations generated by the Word method

Segment

Segments and phones are synonyms.

festival> (utt.relation.print complex_utt 'Segment)
utt9

Festival segments = phones

Syllable

Consonants and vowels combine to make syllables. They are often considered the phonological building blocks of words, but there is no universally accepted definition for a syllable. An approximate definition is : a syllable is a vowel sound together with some of the surrounding consonants closely associated with it. The general structure of a syllable consists of three segments :

  • Onset : a consonant or consonant cluster
  • Nucleus : a sequence of vowels or syllabic consonant
  • Coda : a sequence of consonants

Nucleus and coda are grouped together as a Rime. Prominent syllables are called accented; they are louder, longer and have a different pitch.

The following Festival command shows the syllables of the defined utterance.

festival> (utt.relation.print complex_utt 'Syllable)
utt8

Festival syllables

SylStructure

Words, segments and syllables are related in the HRG trought the SylStructure. The command

festival> (utt.relation.print complex_utt 'SylStructure)

prints these related items.

utt10 (click to enlarge)

Festival SylStructure  (click to enlarge)

4. Prosodic Information Addition

Besides the phrasing with break indices, additional prosodic components can be added to speech synthesis to improve the voice quality. Some of these elements are :

  • pitch accents (stress)
  • final boundary tones
  • phrasal tones
  • F0 contour
  • tilt
  • duration

Festival supports ToBI, a framework for developing community-wide conventions for transcribing the intonation and prosodic structure of spoken utterances in a language variety.

The process

festival> (Intonation complex_utt)

generates two additional relations : IntEvent and Intonation

utt11

Festival prosodic relations

IntEvent

The command

festival> (utt.relation.print complex_utt 'IntEvent)

prints the IntEvent items.

utt12

Festival IntEvent items

The following types are listed :

  • L-L% : low boundary tone
  • H* : peak accent
  • !H* : downstep high
  • L+H* : bitonal accent, rising peak

Intonation

The command

festival> (utt.relation.print complex_utt 'Intonation)

prints the Intonation items.

utt13

Festival Intonation items

Only the syllables with stress are displayed.

Duration

The process

festival> (Duration complex_utt)

creates no new relations and I have not seen any new items or features in other relations.

utt14

utt14

Target

The last process in the prosodic stage

festival> (Int_Targets complex_utt)

generates the additional relation Target.

utt15

Festival relations after the Int_Targets process

The command

festival> (utt.relation.print complex_utt 'Target)

prints the target items.

Festival clustergen targets

Festival clustergen targets

The unique target features are the segment name and the segment end time.

5. Waveform Generation

Wave

The process

festival> (Wave_Synth complex_utt)
festival> (utt.relation.print complex_utt 'Wave)

generates five new relations :

  • HMMstate
  • segstate
  • mcep
  • mcep_links
  • Wave
Festival

Festival Wave relations for clustergen voice

In the next chapters we use the method

(utt.relation.print complex_utt 'Relation)

to display the relations and features specific to the diphone voice.

Relation ‘HMMstate

HMMstates for clustergen voice

HMMstates for Festival clustergen voice

Relation ‘segstate

segstates for

segstates for Festival clustergen voice

Relation ‘mcep

mcep

mcep features for Festival clustergen voice

Relation ‘mcep_links

mcep_link

mcep_links relation for Festival clustergen voice

Relation ‘Wave

wave

Wave relation for Festival clustergen voice

Diphone Voice Utterance

If we use a diphone voice (e.g. the default kal_diphone voice) instead of the clustergen voice, the last step of the prosodic stage (No 4) and the complete wave-synthesis stage (No 5) provide different relations and features.

We use the Festival method “SayText”, a combination of the above presented processes

  • Initialize utt
  • Text utt
  • Token utt
  • POS utt
  • Phrasify utt
  • Word utt
  • Intonation utt
  • Duration utt
  • Int_Targets utt
  • Wave_Synt utt

to create the same complex utterance as in the first example :

festival>
(set! complex_utt (SayText "Mr. James Brown Jr. attended flight 
No AA4101 to Boston on Friday 02/13/2014."))
(utt.relationnames complex_utt)

Here are the results :

clun

Utterance relations for a Festival diphone voice

In the next chapters we use the method

(utt.relation.print complex_utt 'Relation)

to display the relations and features specific to the diphone voice.

Relation ‘Target

utt16

Relation Target for diphone voice

Relation ‘Unit

utt18

Relation Unit for diphone voice (click to enlarge)

Relation ‘SourceCoef

utt19

Relation ‘SourceCoef for diphone voice

Relation ‘fo

utt20

Relation f0 for diphone voice

Relation ‘TargetCoef

utt21

Relation TargetCoef for diphone voice

Relation ‘US_map

utt22

Relation US_map for diphone voice

Relation ‘Wave

utt23

Relation Wave for diphone voice

Playing and saving the diphone voice utterance :

utt24

Playing and saving a synthesized Festival utterance

Diphone voice utterance shown in Audacity :

utt_wave

Display of a synthesized Festival utterance

Clunits Voice Utterance

What is true for the diphone voice is also ture for a clunits voice. The last step of the prosodic stage (No 4) and the complete wave-synthesis stage (No 5) generate different relations and features. As an example we use a swedish clunits voice :

clu1

Relations for Festival clunits voice

Relation ‘Target

clu4

Relation Target for Festival clunits voice  (click to enlarge)

Relation ‘Unit

clu3

Relation unit for Festival clunits voice  (click to enlarge)

Relation ‘SourceSegments

clu2

Relation SourceSegments for Festival clunits voice

That’s all.

Festival TTS Scheme

Last update : April 23, 2015

Scheme

The Scheme programmimg language is a dialect of Lisp designed to be more consistent. Originally specified in 1958, Lisp is the second-oldest high-level programming language;  Fortran is one year older. Lisp has a distinctive, fully parenthesized Polish Notation. The name LISP derives from LISt Processing. All program code in Lisp is written as s-expressions, or parenthesized lists. Lots of brackets comes to most people’s mind when they look at s-expressions. Lisp was invented by John McCarthy in 1958 while he was at the Massachusetts Institute of Technology (MIT). Lisp became quickly the favored programming language for artificial intelligence (AI) research. Another dialect of Lisp is Common Lisp (CL).

Scheme was invented by Guy Lewis Steele Jr. and Gerald Jay Sussman. The Scheme language is standardized in the official IEEE standard and in a de facto standard called the Revisedn Report on the Algorithmic Language Scheme (RnRS). The latest standard R7RS was finalized in 2013. An improper list of Scheme resources is available at the schemers.org website.

The Festival TTS system uses an enhanced version of Scheme, based on George Carrette’s SIOD (Scheme in one Defun).

In Scheme, an expression is an atom or a list. Atoms can be symbols, numbers, strings or special types like functions, arrays, hash tables. A list consist of a left parenthesis, a number of expressions and a right parenthesis. Comments are started by a semicolon and run until end of line. Scheme is case sensitive.

Each member of a list (except set!) is evaluated. The 1st item is treated as function and applied using the remainder of the lists as arguments to the function. Strings are bounded by the double-quote character.

"This is a string with embedded \"...\" quotes"

Double quotes in a string are escaped with a backslash ( \ ).

Scheme Core Functions

The core functions are :

  • (set! SYMBOL VALUE) ; attribute a value to a variable
  • (define (FUNCNAME ARG0 ARG1 …) . BODY) ; declare the boy of a function with arguments
  • (if TEST TRUECASE [FALSECASE] ) ; if the value of TEST is non-nil, return value of evaluated truecase-expression, otherwise, if present, return evaluated falsecase-expressionor nil
  • (cond (TEST0 . BODY) (TEST1 . BODY) …) ; multipe if statement
  • (begin . BODY) ; returns the value of the last s-expression in a list
  • (or . DISJ) ; return the value of the 1st non-nil disjunct
  • (and : CONJ) ; return the value of the 1st nil conjunct or the value of the last conjunct
  • (car EXPR) ; return the 1st item of a list or nil for an atom or empty list
  • (cdr EXPR) ; return the rest of a list or nil for an atom or empty list
  • (cons EXPR0 EXPR1) ; build a new list with car of EXPR0 and crd of EXPR1
  • (list . BODY) ; form a list from each of the arguments
  • (append . BODY) ; join each of the arguments into a single list
  • (nth N LIST) ; return Nth member of list (1st item is 0th member)
  • (nth_cdr N LIST) ; return Nth cdr list
  • (last LIST) ; return last crd of a list
  • (reverse list) ; return the list in inverse order
  • (member ITEM LIST) ; returns the cdr in LIST whose car is ITEM or nil if not found
  • (assoc ITEM ALIST) ; standardlist format for representing value pairs
  • (intern “abc”) ; convert a string to a symbol
  • (parse-number “3.14”) ; convert a string to a number
  • (mapcar fct list1 list2 …) ; returns a list which is the result of applying the function fcn to the elements of each of the lists specified
  • (number? x) ; returns true if x is a number
  • (symbol? x) ; returns true if x is a symbol
  • (quote x) ;  special form that returns x without evaluating it. Commonly written in abbreviated format as ‘x (apostrophe x = short hand for quote). The (quote something) form returns that something as itself no matter what the something is (symbol, list …). Self evaluated data like numbers and strings need not to be quoted.

If the 1st statement of a function is a string, it is treated as a documentation string. The string will be printed when help is requested for that function symbol. To request help, press the keys <Esc> and <h> simultaneously after entering the name of the function.

Help

Scheme Help Function

s-expressions

A Scheme s-expression is a construct that returns a value. Typical s-expressions are

3
( 1 2 3)
(a ( b c ) d)
(( a b ) ( d e ))

Some examples of simple arithmetic scheme s-expressions are presented hereafter :

(+ 2 3 4)   ; add the arguments
(set! a 3)  ; define the variable a to 3
(* a 5)     ; multiply the arguments
(/ 8 a)     ; divide the 1st argument by the 2nd argument
(- 12 a)    ; substract the 2nd argument from the 1st argument
(cons 1 2)  ; join 2 arguments as a pair

The following figure shows the results :

Scheme

Scheme arithmetic expressions

Here are some more examples of scheme s-expressions :

; define a list with 3 fruits
(set! fruits '(apples pears bananas))
; use the 1st item of the list
(car fruits) 
; use the remaining items following the 1st item of the list
(cdr fruits)
; define a list with 2 vegetables
(set! vegetables '(carrots beans)) 
; combine the fruit and vegetable lists
(append fruits vegetables) 
; get the number of items in the list
(length fruits)
; get the number of items in the combined list
(length (append fruits vegetables))
; join 2 lists as a pair
(cons vegetables fruits)

The next figure shows the results :

Scheme list expressions

Scheme list expressions

Regular expressions

A regular expression is a sequence of characters that forms a search pattern, mainly for use in string matching (filters). A regular expression is sometimes called a rational expression and abbreviated as regex or regexp. The concept was formalized in the 1950s by the American mathematician Stephen Kleene who created a regular language. Today regular expressions are so useful in computing that the various systems to specify regular expressions have evolved to provide both a basic and extended standard for the grammar and syntax.

Festival Scheme uses a regex implementation based on Henry Spencer‘s regex code. In general all characters match themselves, except for the following ones which have special interpretations (meta-characters), if they are not preceded by a backslash :

.  *  +  ?  [  ]  (  )  |  ^  $  \
  • .    matches any character in the regex
  • *    matches zero or more occurences of the preceding item in the regex
  • +   matches one or more occurences of the preceding item in the regex
  • ?   matches zero or one occurence of the preceding item in the regex
  • [  ]   defines a range of characters
  • (  )  defines a section (scope, block, capturing group)
  • |     or operator ; separates alternatives
  • ^   if specified first negates the class
  • $   matches the ending position of a string
  • \  backslash to escape meta-characters

Examples :

  • a.c   > abc, acc, adc
  • a*c   > c, aac, ac, aaaaaaaac
  • a+c  > ac, aaac
  • ab?c  > ac, abc
  • [a-z] any lower case letter
  • [a-zA-Z] any lower or upper case letter
  • gr(e|a)y  > grey, gray
  • ^abc  > any characters other than a, b or c

Boolean Values

Scheme provides a special unique object called false, whose written representation is #f and who is the result of a condition expression (if or cond). This is the only value that counts as false, all others count as true. For this reason there is no need to provide a boolean object true, but for clarity, Scheme provides an object written #t which can be used as true value. In Festival’s Scheme (SIOD), #f and #t are unbound variables. In SIOD the variable t stands for true and nil stands for false.

Named and unnamed (anonymous = lambda) functions

Here is an example of a new (named) function :

(define (add a b) (+ a b))
(add 7 5)
Scheme  functions

Scheme functions

Scheme provides a lot of useful string functions, for example

(string-equal ATOM1 ATOM2)
(string-append STR1 STR2)
(string-before STR SUBSTR)
(string-after STR SUBSTR)
(string-length SYMBOL)
(string-matches STR REGEX)
(Symbolexplode SYMBOL)
(member_string STR LIST)

A function that returns either true (t) or false (nil) is called a predicate.

Instead of a named function, we can create unnamed (anonymous) functions using the special lambda form.

(lambda (arg1 arg2 ...) form1 form2 ...)

This function is simply defined where it’s used; for example, to square the items of a list we can apply the following construct

(set! mylist '(1 3 6 8))
(mapcar (lambda (x) (* x x)) mylist)
Lambda

Scheme lambda example

Conditionals

Scheme provides a very powerful nested conditional expression with the syntax

( cond < clause 1 > <c lause 2 > < clause 3 > ... )

where < clause > is of the form

( < predicate > < form > ...)

The last clause may be an else clause which has the syntax

( < else > < form 1> < form 2 > ...)

Cond is a special form where each clause is processed until the predicate expression of the clause evaluates true. Then each subform in the predicate is evaluated with the value of the last one becoming the value of the cond form.

A simple example is :

(cond (( > a b ) 'greater)
      (( < a b ) 'less)
      (t 'equal))
cond

Scheme cond example

A more complex example is show hereafter :

(cond
((string-matches name "[1-9][0-9]+")
  (mbarnig_lb::number token name))
((string-matches name "[A-Z]+")
  (mbarnig_lb::upper_case_abbr token name))
((string-matches  name "[a-z]+")
  (mbarnig_lb::lower_case_abbr token name))
(t (list name))) ; when no specific rules apply do the general ones

Another special conditional form is

(and subform1 subform2 subform3 ...)

This form causes the evaluation of its subforms in order, from left to right, continuing if and only if the subform returns a non-null value.

System commands

Here are some examples of scheme system commands :

(quit) or (exit) ; close the program
(pwd) ; show the pathname of the current directory
(cd DIRECTORY)
(getenv NAME)
(setenv NAME VALUE)
(set_backtrace t) ; display a backtrace when a scheme error occurs
(unwind-protect …) ; catch errors and continue normally

Hooks

A hook in Scheme terms is a position within the program code where a user may specify his own customization. There a number of places in Festival where hooks are used. A hook variable contains either a function or a list of functions that are to be applied at some point in the processing. A list of defined hooks  in Festival is shown below :

  • after_analysis_hooks : functions applied after analysis, before synthesis
  • default_after_analysis_hooks : default functions applied after analysis
  • before_synth_hooks : functions applied on synthesized utterances
  • default_before_synth_hooks : default functions applied on synthesized utterances
  • after_synth_hooks : functions applied after synthesis to manipulate waveforms
  • default_after_synth_hooks : default functions applied after synthesis
  • diphone_module_hooks : functions applied at the start of the diphone module
  • tts_hooks : functions applied during text to speech
  • xxml_hooks : functions applied before tts_hooks
  • xxml_token_hooks : functions applied to each token
Festival hooks

Festival hooks

Links

A list with links to websites providing additional informations about Scheme is provided hereafter :