Introduction
hyper is a fast HTTP implementation written in and for Rust.
- A Client for talking to web services.
- A Server for building those web services.
- Blazing fast* thanks to Rust.
- High concurrency with non-blocking sockets.
- HTTP/1 and HTTP/2 support.
Binding a Tiny Server
In this section, we’ll create a Tiny Server from scratch. We’ll start with the necessary dependencies, declare a main function, and then try to build and run it.
Adding necessary dependencies
First, we need to create a new folder where we’ll add the necessary dependencies to create our first microservice. Use cargo to make a new project called hyper-microservice:
cargo new hyper-microservice
Open the created folder and add dependencies to your Cargo.toml file:
[dependencies]
hyper = "0.12"
Full Cargo.toml
[package]
name = "hyper-microservice"
version = "0.1.0"
edition = "2021"
# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html
[dependencies]
hyper = "0.12"
The single dependency is the hyper crate. The 0.12 release of this crate is asynchronous and is built on top of the futures crate. It also uses the tokio crate as its runtime, which includes the scheduler, reactor, and asynchronous sockets. Some of the necessary types of the tokio crate are re-exported in the hyper::rt module. The main purpose of hyper is to work with the HTTP protocol, so the crate could support other runtimes in the future.
The main function of the server
Let’s start with the main function and add the necessary dependencies one by one, looking in detail at why we need each one. A minimal HTTP server needs the following:
- An address to bind to
- A server instance to handle incoming requests
- A default handler for any request
- A reactor (runtime) where the server instance will operate
Address of the server
The first thing we need is an address. A socket address consists of an IP address and a port number. We’ll use IPv4 because it’s widely supported.
The standard Rust library contains an IpAddr type to represent the IP address. We’ll use the SocketAddr struct, which contains both the IpAddr and the u16 for the port number. We can construct the SocketAddr from a tuple of the ([u8; 4], u16) type. Add the following code to our main function:
let addr = ([127, 0, 0, 1], 8080).into();
We used the impl<I: Into<IpAddr>> From<(I, u16)> for SocketAddr implementation here, which, in turn, uses impl From<[u8; 4]> for IpAddr. This lets us use the .into() method call to construct a socket address from the tuple. Alternatively, we can create new SocketAddr instances with the SocketAddr::new constructor. In production applications, we would typically parse the socket address from an external string (a command-line parameter or an environment variable) and, if none is set, create the SocketAddr from a tuple with default values.
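The three ways of obtaining an address mentioned above can be sketched with the standard library alone. This is a minimal sketch, not part of the server code: the resolve_addr helper and the ADDRESS environment variable name are assumptions made for illustration.

```rust
use std::net::{IpAddr, Ipv4Addr, SocketAddr};

/// Hypothetical helper: parse an address from an optional string and
/// fall back to a default built from a tuple when parsing is not possible.
fn resolve_addr(raw: Option<&str>) -> SocketAddr {
    raw.and_then(|s| s.parse::<SocketAddr>().ok())
        .unwrap_or_else(|| ([127, 0, 0, 1], 8080).into())
}

fn main() {
    // From a tuple, through From<(I, u16)> for SocketAddr.
    let from_tuple: SocketAddr = ([127, 0, 0, 1], 8080).into();
    // With the explicit constructor.
    let from_ctor = SocketAddr::new(IpAddr::V4(Ipv4Addr::new(127, 0, 0, 1)), 8080);
    assert_eq!(from_tuple, from_ctor);

    // ADDRESS is an assumed variable name, used only for this sketch.
    let env_value = std::env::var("ADDRESS").ok();
    let addr = resolve_addr(env_value.as_deref());
    println!("binding to {}", addr);
}
```

Falling back silently keeps the sketch short; a production service might prefer to report an unparsable address instead of ignoring it.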
Server instances
Now we can create a server instance and bind to this address:
let builder = Server::bind(&addr);
The preceding line calls the bind constructor of hyper::server::Server, which actually returns a Builder, not a Server instance. The Server struct implements the Future trait. A Future has a similar role to Result, but describes a value that isn’t available immediately.
Setting the requests handler
The Builder struct provides methods to tweak the parameters of the server being created. For example, hyper’s server supports both HTTP/1 and HTTP/2. You can use a builder value to choose one protocol or both. In the following example, we’re using builder to attach a service for handling incoming HTTP requests using the serve method:
let server = builder.serve(|| {
service_fn_ok(|_| {
Response::new(Body::from("Almost microservice..."))
})
});
Here, we’re using the builder instance to attach a function that generates a Service instance. This function implements the hyper::service::NewService trait. The generated item then has to implement the hyper::service::Service trait. A service in the hyper crate is a function that takes a request and gives a response back. We haven’t implemented this trait ourselves in this example; instead, we use the service_fn_ok function, which turns a function with suitable types into a service handler.
There are two corresponding structs: hyper::Request and hyper::Response. In the preceding code, we ignored a request argument and constructed the same response for every request. The response contains a body of static text.
Adding the server instance to a runtime
Since we now have a handler, we can start the server. The runtime expects a Future instance with the Future<Item = (), Error = ()> type, but the Server struct implements a Future with the hyper::Error error type. We can use this error to inform the user about issues, but in our example we’ll just drop any error. As you might remember, the drop function expects a single argument of any type and returns the unit type, (). The Future trait provides the map_err method, which changes the error type using a function that takes the original error and returns a new one. Drop an error from the server using the following:
let server = server.map_err(drop);
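The same trick can be tried without a runtime, because Result has a map_err method that behaves analogously. The following is only an analogy using the standard library, not the futures API; the parse_port helpers are hypothetical.

```rust
use std::num::ParseIntError;

// Hypothetical helper that reports a detailed error.
fn parse_port(raw: &str) -> Result<u16, ParseIntError> {
    raw.parse::<u16>()
}

// `drop` takes the error by value and returns (), so map_err(drop)
// turns Result<u16, ParseIntError> into Result<u16, ()>.
fn parse_port_quiet(raw: &str) -> Result<u16, ()> {
    parse_port(raw).map_err(drop)
}

fn main() {
    println!("{:?}", parse_port("http"));       // keeps the detailed error
    println!("{:?}", parse_port_quiet("http")); // the error detail is gone
    println!("{:?}", parse_port_quiet("8080"));
}
```

As with the server, discarding the error this way loses the information you would need for diagnostics, so it is best kept to examples.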
We now have everything we need and can start the server with the specific runtime. Use the hyper::rt::run function to start the server:
hyper::rt::run(server);
Don’t compile it yet, because we haven’t imported the types. Add the following imports to the top of the source file:
use hyper::{Body, Response, Server};
use hyper::rt::Future;
use hyper::service::service_fn_ok;
We need to import the different hyper types that we are using: Server, Response, and Body. The last line imports the service_fn_ok function. The Future import needs special attention; it’s the trait re-exported from the futures crate, and it’s used everywhere in the hyper crate.
Full Example
use hyper::{Body, Response, Server};
use hyper::rt::Future;
use hyper::service::service_fn_ok;
fn main() {
let addr = ([127, 0, 0, 1], 8080).into();
let builder = Server::bind(&addr);
let server = builder.serve(|| {
service_fn_ok(|_| {
Response::new(Body::from("Rust Microservice"))
})
});
let server = server.map_err(drop);
hyper::rt::run(server);
}
Building and running
You can now compile the code and start the server with the following command:
cargo run
Use your browser to connect to the server. Enter http://localhost:8080/ in the browser’s address bar and the browser will connect to your server and show you a page with the text you entered in the previous code.
If you want to rebuild automatically on changes, please check here.
Handling incoming requests
We’ve created a server, but it isn’t very useful until it can respond to real requests. In this section, we’ll add handlers for the requests and apply the principles of REST.
Adding a service function
In the previous section, we implemented a simple service based on the service_fn_ok function, which expects the service function not to return any errors. There is also the service_fn function, which can be used to create handlers that can return an error. It’s more suitable for asynchronous Future results.
As we saw previously, the Future trait has two associated types: one for a successful result and one for an error. The service_fn function expects a result that can be converted into a future with the IntoFuture trait. You can read more about the futures crate and its types in the next chapter.
Let’s change the previous service function into one that returns the Future instance:
let server = builder.serve(|| service_fn(microservice_handler));
Then add this unimplemented service function:
fn microservice_handler(req: Request<Body>)
-> impl Future<Item=Response<Body>, Error=Error>
{
unimplemented!();
}
Similar to the previous one, this function expects a Request, but it doesn’t return a simple Response instance. Instead, it returns a future result. Since Future is a trait (which doesn’t have a known size), we can’t return it directly from the function as an unsized entity; traditionally, we’d have to wrap it in a Box. In this case, however, we used a newer approach, the impl Trait syntax. This allows us to return an implementation of the trait by value, without boxing it. Our future resolves to either a hyper::Response<Body> item or a hyper::Error error. You should import the necessary types if you’ve started a project from scratch and aren’t using the code examples included with this book:
use futures::{future, Future};
use hyper::{Body, Error, Method, Request, Response, Server, StatusCode};
use hyper::service::service_fn;
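The difference between boxing a trait object and returning impl Trait, mentioned above, can be sketched with a standard-library trait. Iterator stands in for Future here purely as an assumption for illustration; the two evens functions are hypothetical.

```rust
// Boxed: the concrete iterator type is hidden behind a heap-allocated
// trait object with dynamic dispatch.
fn evens_boxed(limit: u32) -> Box<dyn Iterator<Item = u32>> {
    Box::new((0..limit).filter(|n| n % 2 == 0))
}

// impl Trait: the concrete type is still hidden from callers, but the
// value is returned directly, without an allocation.
fn evens_impl(limit: u32) -> impl Iterator<Item = u32> {
    (0..limit).filter(|n| n % 2 == 0)
}

fn main() {
    let boxed: Vec<u32> = evens_boxed(7).collect();
    let unboxed: Vec<u32> = evens_impl(7).collect();
    assert_eq!(boxed, unboxed);
    println!("{:?}", unboxed);
}
```

One limitation to keep in mind: a function returning impl Trait must return a single concrete type from every branch, which is one reason boxing is still used when branches produce different future types.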
We also imported the Future trait from the futures crate. Make sure you’re either using edition = "2018" or later in the Cargo.toml file, or importing the crates in main.rs:
extern crate futures;
extern crate hyper;
We’ve imported the types in the code, but we still have to declare the crates in the Cargo.toml file. Add these crates to the dependency list of your Cargo.toml:
[dependencies]
futures = "0.1"
hyper = "0.12"
Everything is now ready to implement a service handler.
Implementing a service function
Our service function will support two kinds of requests:
- GET requests to the / path with an index page response
- Other requests with a NOT_FOUND response
To detect the corresponding method and path, we can use the methods of the Request object. See the following code:
fn microservice_handler(req: Request<Body>)
-> impl Future<Item=Response<Body>, Error=Error>
{
match (req.method(), req.uri().path()) {
(&Method::GET, "/") => {
future::ok(Response::new(INDEX.into()))
},
_ => {
let response = Response::builder()
.status(StatusCode::NOT_FOUND)
.body(Body::empty())
.unwrap();
future::ok(response)
},
}
}
I used a match expression to detect the corresponding method returned from the req.method() function, and also the path of the URI of the Request returned by the req.uri().path() method’s chain call.
The method() function returns a reference to the Method instance. Method is an enumeration that contains all supported HTTP methods. Unlike other popular languages, which return strings for methods, Rust uses a strict set of methods from a finite enumeration. This helps to detect typos during compilation.
The Future instances created with the future::ok function are also returned. This function immediately resolves the future to a successful result with an item of the corresponding type. This is useful for static values; we don’t need to wait to create them.
A future usually represents a long-running operation that won’t return a result immediately. The runtime polls the future until it returns the result. This is useful for performing asynchronous requests to a database.
We can also return streams instead of a whole result. The futures crate contains a Stream trait for those cases.
In our match expression, we used Method::GET and the "/" path to detect requests for the index page. In this case, we return a Response constructed with the new function, which takes the HTML string (converted into a Body) as an argument.
If no other branch matches and we fall through to the _ pattern, we return a response with the NOT_FOUND status code from the StatusCode type. This type contains all of the status codes of the HTTP protocol.
We use the body method to finish constructing the response, passing an empty Body as the argument. The body method returns a Result, because the builder reports an error if any of the earlier settings were invalid, so we use unwrap to unpack the Response from the Result.
Index pages
The last thing we need is an index page. It’s considered good form to return some information about a microservice when requested, but you may hide it for security reasons.
Our index page is a simple string with HTML content inside:
const INDEX: &'static str = r#"
<!doctype html>
<html>
<head>
<title>Rust Microservice</title>
</head>
<body>
<h3>Rust Microservice</h3>
</body>
</html>
"#;
This is a constant value that can’t be modified. Pay attention to the start of the string, r#", if you haven’t used it before. This is a raw string literal in Rust: it can span multiple lines, doesn’t process escape sequences, and has to end with "#.
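To see what the r#" syntax changes, here is a small sketch that writes the same text as an ordinary string and as a raw string. The constant names and contents are made up for illustration.

```rust
// The same text written twice: once with escapes, once as a raw string.
const ESCAPED: &str = "<a href=\"/user/1\">first user</a>";
const RAW: &str = r#"<a href="/user/1">first user</a>"#;

// Raw strings may span several lines; the line breaks become part of the value.
const PAGE: &str = r#"<html>
<body class="main"></body>
</html>"#;

fn main() {
    assert_eq!(ESCAPED, RAW);
    assert_eq!(PAGE.lines().count(), 3);
    println!("{}", PAGE);
}
```

If the text itself contains the sequence "#, more # characters can be added on both sides of the literal, for example r##"..."##.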
Full Example
extern crate futures;
extern crate hyper;
use futures::{future, Future};
use hyper::{Body, Error, Method, Request, Response, Server, StatusCode};
use hyper::service::service_fn;
const INDEX: &'static str = r#"
<!doctype html>
<html>
<head>
<title>Rust Microservice (yby)</title>
</head>
<body>
<h3>Rust Microservice (yby)</h3>
</body>
</html>
"#;
fn microservice_handler(req: Request<Body>)
-> impl Future<Item=Response<Body>, Error=Error>
{
match (req.method(), req.uri().path()) {
(&Method::GET, "/") => {
future::ok(Response::new(INDEX.into()))
},
_ => {
let response = Response::builder()
.status(StatusCode::NOT_FOUND)
.body(Body::empty())
.unwrap();
future::ok(response)
},
}
}
fn main() {
let addr = ([127, 0, 0, 1], 8080).into();
let builder = Server::bind(&addr);
let server = builder.serve(|| service_fn(microservice_handler));
let server = server.map_err(drop);
hyper::rt::run(server);
}
Implementing the REST principles
If everyone were to create rules of interaction with microservices from scratch, we’d have an excess of private standards of intercommunication. REST isn’t a strict set of rules; it’s an architectural style intended to make interacting with microservices simple. It suggests a set of HTTP methods to create, read, update, and delete data, and to perform actions. We’ll add methods to our service and fit them to REST principles.
Adding a shared state
You may have already heard that shared data is a bad thing and a potential cause of bottlenecks, if it has to be changed from separate threads. However, shared data can be useful if we want to share the address of a channel or if we don’t need frequent access to it. In this section, we need a user database. In the following example, I’ll show you how to add a shared state to our generator function. This approach can be used for a variety of reasons, such as keeping a connection to a database.
A user database will obviously hold data about users. Let’s add some types to handle this:
type UserId = u64;
struct UserData;
UserId represents the user’s unique identifier. UserData represents the stored data, but we use an empty struct in this example to avoid dealing with serialization and parsing streams.
Our database will be as follows:
type UserDb = Arc<Mutex<Slab<UserData>>>;
Arc is an atomic reference counter that provides multiple references to a single instance of data (in our case, this is the Mutex over the slab of data). Atomic entities can be safely used with multiple threads. Arc uses native atomic operations to update the counter whenever a reference is cloned or dropped. Without them, two or more threads could corrupt the reference counter, which could cause segmentation faults, data loss, or a memory leak if the counter ended up greater than the number of references in the code.
Mutex is a mutual-exclusion wrapper that controls access to mutable data. It ensures that only one thread has access to the data at a time; other threads have to wait until the thread that has locked the mutex releases it. NOTE: You have to take into account that if you have a locked Mutex in one thread and that thread panics, the Mutex instance becomes poisoned, and if you try to lock it from another thread, you’ll get an error.
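The poisoning behavior from the note can be reproduced with a few lines of standard-library code. This is a minimal sketch with a hypothetical poison_and_recover function; whether recovering the data from a poisoned mutex is appropriate depends on the invariants of your own data.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical demo: a worker thread panics while holding the lock,
/// and the main thread then recovers the value from the poisoned mutex.
fn poison_and_recover() -> u32 {
    let shared = Arc::new(Mutex::new(0u32));
    let worker = Arc::clone(&shared);

    let result = thread::spawn(move || {
        let mut guard = worker.lock().unwrap();
        *guard += 1;
        panic!("simulated failure while holding the lock");
    })
    .join();

    // The worker panicked, so join returns Err and the mutex is poisoned.
    assert!(result.is_err());
    assert!(shared.is_poisoned());

    // Bind the value to a local so the guard is dropped before `shared` is.
    let value = match shared.lock() {
        Ok(guard) => *guard,
        // PoisonError still gives access to the guard if we decide to continue.
        Err(poisoned) => *poisoned.into_inner(),
    };
    value
}

fn main() {
    println!("value after recovery: {}", poison_and_recover());
}
```

The panic message of the worker thread is printed to standard error when this runs; that is expected.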
You may be wondering why we reviewed these types if the asynchronous server can work in a single thread. There are two reasons.
- First, you may need to run the server in multiple threads for scaling.
- Second, all types that provide interaction facilities, such as Sender objects (from the standard library, the futures crate, or anywhere else) or database connections, are often wrapped with these types to make them compatible with a multithreaded environment. It can be useful to know what’s going on under the hood.
You might be familiar with the standard library types, but Slab may seem a little different. This type is something of a workhorse in web-server development; most pools use this approach. Slab is an allocator that can store and remove any value identified by an ordered number. It can also reuse the slots of removed items. It’s similar to a Vec that doesn’t shrink when you remove an element, but reuses the free space automatically. For servers, it’s useful for keeping connections or requests, such as in a JSON-RPC protocol implementation.
In this case, we use Slab to allocate new IDs for users and to keep the data with the user. We use Arc with the Mutex pair to protect our database from data races, because different requests can be processed in different threads, which may both try to access the database. In fact, Rust won’t let you compile the code without these wrappers.
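The slot-reuse idea behind Slab can be sketched in a few lines on top of Vec. MiniSlab below is a hypothetical, simplified stand-in written for illustration only; it is not the slab crate’s implementation and its method behavior differs in places (for example, this remove returns an Option instead of panicking).

```rust
/// Hypothetical, simplified slab: values live in numbered slots and
/// the slots of removed values are handed out again.
struct MiniSlab<T> {
    slots: Vec<Option<T>>,
    free: Vec<usize>,
}

impl<T> MiniSlab<T> {
    fn new() -> Self {
        MiniSlab { slots: Vec::new(), free: Vec::new() }
    }

    /// Stores a value and returns the key of the slot that holds it.
    fn insert(&mut self, value: T) -> usize {
        match self.free.pop() {
            Some(key) => {
                self.slots[key] = Some(value);
                key
            }
            None => {
                self.slots.push(Some(value));
                self.slots.len() - 1
            }
        }
    }

    fn get(&self, key: usize) -> Option<&T> {
        self.slots.get(key).and_then(|slot| slot.as_ref())
    }

    /// Empties the slot and remembers it for reuse.
    fn remove(&mut self, key: usize) -> Option<T> {
        let value = self.slots.get_mut(key).and_then(|slot| slot.take());
        if value.is_some() {
            self.free.push(key);
        }
        value
    }
}

fn main() {
    let mut users = MiniSlab::new();
    let alice = users.insert("alice");
    let bob = users.insert("bob");
    users.remove(alice);
    // The freed slot is reused, so the new key equals the removed one.
    let carol = users.insert("carol");
    println!("alice={} bob={} carol={}", alice, bob, carol);
    println!("{:?}", users.get(carol));
}
```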
We have to add an extra dependency, because the Slab type is available in the external slab crate. Add it to Cargo.toml:
[dependencies]
slab = "0.4"
futures = "0.1"
hyper = "0.12"
Import these necessary types in the main.rs file:
use std::fmt;
use std::sync::{Arc, Mutex};
use slab::Slab;
use futures::{future, Future};
use hyper::{Body, Error, Method, Request, Response, Server, StatusCode};
use hyper::service::service_fn;
Let’s write a handler and a main function in the following section.
Accessing a shared state from a service function
To get access to a shared state, you need to provide a reference to the shared data. This is simple, because we’ve already wrapped our state with Arc, which provides us with a clone() function to duplicate the reference to the shared object.
Since our service function needs an extra parameter, we have to rewrite both the definition of the microservice_handler function and the call to it. It now has an extra argument, which is the reference to the shared state:
fn microservice_handler(req: Request<Body>, user_db: &UserDb)
-> impl Future<Item=Response<Body>, Error=Error>
We also have to pass this reference from the main function:
fn main() {
let addr = ([127, 0, 0, 1], 8080).into();
let builder = Server::bind(&addr);
let user_db = Arc::new(Mutex::new(Slab::new()));
let server = builder.serve(move || {
let user_db = user_db.clone();
service_fn(move |req| microservice_handler(req, &user_db))
});
let server = server.map_err(drop);
hyper::rt::run(server);
}
As you can see, we created a Slab and wrapped it with Mutex and Arc. After that, we moved the object, called user_db, into the serve function call of the server builder that’s using the move keyword. When the reference moves into the closure, we can send it to microservice_handler. This is a handler function called by a closure sent to the service_fn call. We have to clone the reference to move it to a nested closure, because that closure can be called multiple times. We shouldn’t move the object completely, however, because a closure sent to the serve function can be called multiple times and so the runtime might need the object again later.
In other words, both closures can be called multiple times. The closure of service_fn will be called in the same thread as the runtime, and we can use a reference for the value inside it.
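The clone-then-move pattern with two nested closures can be tried without hyper. In this sketch, make_factory plays the role of the closure passed to serve and the boxed closure plays the role of the one passed to service_fn; all names here are hypothetical, and a plain counter stands in for the user database.

```rust
use std::sync::{Arc, Mutex};

// Hypothetical stand-in for the shared database: a simple hit counter.
type Counter = Arc<Mutex<u64>>;

// Plays the role of the handler: borrows the shared state for one call.
fn handle(counter: &Counter) -> u64 {
    let mut hits = counter.lock().unwrap();
    *hits += 1;
    *hits
}

// Plays the role of the outer closure: it may be called many times, so it
// clones the Arc for each inner closure instead of giving its own copy away.
fn make_factory(counter: Counter) -> impl Fn() -> Box<dyn Fn() -> u64> {
    move || {
        let counter = counter.clone();
        Box::new(move || handle(&counter)) as Box<dyn Fn() -> u64>
    }
}

fn main() {
    let counter: Counter = Arc::new(Mutex::new(0));
    let factory = make_factory(counter.clone());
    let first = factory();
    let second = factory();
    first();
    second();
    first();
    // Every handler updated the same underlying counter.
    println!("total hits: {}", counter.lock().unwrap());
}
```

Without the inner clone, the first call of the outer closure would move its only Arc into the inner closure, and the compiler would reject the code because the outer closure could then be called only once.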
Parsing paths in a microservice
A common task in web development is to use functions that work with persistent storage. These functions are often called create, read, update, and delete (CRUD) functions. They are the most common operations with data.
We can implement a CRUD set for our service, but first we have to identify the entity that we want to work with. Imagine that we need three types of entities: users, articles, and comments. In this case, I recommend that you separate the microservices, because the users microservice is responsible for identity, the articles microservice is responsible for the content, and the comments microservice handles content. However, you would get more benefits if you could reuse these entities for more than one context.
Before we implement all the handlers, we need a helper function that creates empty responses with the corresponding HTTP status codes:
fn response_with_code(status_code: StatusCode) -> Response<Body> {
Response::builder()
.status(status_code)
.body(Body::empty())
.unwrap()
}
This function carries out a few simple actions – it expects a status code, creates a new response builder, sets that status, and adds an empty body.
We can now add a new request handler that checks three path variants:
- The index page (path /)
- Actions with user data (prefix /user/)
- Other paths
We can use the match expression to handle all of these cases. Add the following code to the microservice_handler function:
let response = {
match (req.method(), req.uri().path()) {
(&Method::GET, "/") => {
Response::new(INDEX.into())
},
(method, path) if path.starts_with(USER_PATH) => {
unimplemented!();
},
_ => {
response_with_code(StatusCode::NOT_FOUND)
},
}
};
future::ok(response)
As you can see, we used an if guard in the second branch to detect that the path starts with the /user/ prefix. This prefix is stored in the USER_PATH constant:
const USER_PATH: &str = "/user/";
Unlike the previous example, in this case we’ll use our brand new response_with_code function to return a NOT_FOUND HTTP response. We also assign a response to the response variable and use it to create a Future instance with the future::ok function.
Implementing REST methods
Our microservices can already distinguish between different paths. All that’s left is to implement request handling for the users’ data. All incoming requests have to contain the user prefix in their paths.
Extracting the user’s identifier
To modify a specific user, we need their identifier. REST specifies that you need to get the IDs from a path, because REST maps data entities to URLs.
We can extract a user’s identifier from the tail of the path, which we already have. This is why we use the starts_with method of the string, instead of checking the path for strict equality with USER_PATH.
We previously declared the UserId type, which equals the u64 unsigned number. Add this code to the second branch of the previously-declared match expression with the (method, path) pattern to extract the user’s identifier from the path:
let user_id = path.trim_start_matches(USER_PATH)
.parse::<UserId>()
.ok()
.map(|x| x as usize);
The str::trim_start_matches method (called trim_left_matches in older Rust versions, where that name is now deprecated) removes the prefix of the string while it matches the string provided as the argument. After that, we use the str::parse method, which tries to convert a string (the remaining tail) to a type that implements the FromStr trait of the standard library. UserId already implements this, because it’s equal to the u64 type, which can be parsed from a string.
The parse method returns a Result. We convert this to an Option instance with the Result::ok method. We won’t try to handle errors with the IDs. The None value represents either the absence of a value or a wrong value.
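The extraction step can be tested in isolation. The sketch below wraps it in a hypothetical parse_user_id function and uses str::trim_start_matches, the current name of trim_left_matches; it stops before the conversion to usize.

```rust
type UserId = u64;
const USER_PATH: &str = "/user/";

/// Hypothetical wrapper around the extraction step: strips the prefix
/// and tries to parse the tail of the path as a user ID.
fn parse_user_id(path: &str) -> Option<UserId> {
    path.trim_start_matches(USER_PATH)
        .parse::<UserId>()
        .ok()
}

fn main() {
    // A well-formed ID is extracted.
    println!("{:?}", parse_user_id("/user/42"));
    // A missing ID and a malformed ID both collapse into None.
    println!("{:?}", parse_user_id("/user/"));
    println!("{:?}", parse_user_id("/user/abc"));
    // A trailing slash also prevents parsing with this approach.
    println!("{:?}", parse_user_id("/user/42/"));
}
```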
We also use the map method of the returned Option instance to convert the value to the usize type. This is because Slab uses usize for IDs, but the real size of the usize type depends on the platform architecture. It can be 32 or 64 bits wide, depending on the largest memory address that you can use.
Why can’t we use usize for UserId since it implements the FromStr trait? This is because a client expects the same behavior from an HTTP server regardless of the platform architecture. It’s bad practice to use parameters of unpredictable size in HTTP requests.
Sometimes, it can be difficult to choose a type to identify the data. We use map to convert the u64 value to usize. This isn’t safe, however, on architectures where usize is 32 bits wide, because a UserId can be larger than usize::MAX and the as cast silently truncates it. That’s acceptable for tiny microservices, but it’s an important point to bear in mind for microservices that you’ll use in production. Often, this problem is simple to solve, because you can use the ID type of a database.
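One possible way to avoid the silent truncation is a checked conversion with TryFrom. This is a sketch of an alternative, not what the example server does; to_slab_key is a hypothetical helper, and u32 is used in main only to make the overflow visible on 64-bit machines as well.

```rust
use std::convert::TryFrom;

/// Hypothetical helper: converts an external u64 ID into a slab key,
/// returning None when the ID doesn't fit into usize on this platform.
fn to_slab_key(id: u64) -> Option<usize> {
    usize::try_from(id).ok()
}

fn main() {
    let too_big: u64 = u64::from(u32::MAX) + 1;

    // An `as` cast keeps only the low 32 bits, so the ID silently becomes 0.
    assert_eq!(too_big as u32, 0);
    // A checked conversion reports the problem instead.
    assert!(u32::try_from(too_big).is_err());

    println!("{:?}", to_slab_key(7));
}
```

With this helper, an ID that doesn’t fit would be treated like any other unknown ID, which fits the existing None handling.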
Getting access to the shared data
In this user handler, we need access to the user database. Because the database is a Slab instance that’s wrapped with a Mutex instance, we have to lock the mutex to get exclusive access to the slab. There’s a Mutex::lock function that returns Result<MutexGuard, PoisonError<MutexGuard>>. MutexGuard is a scoped lock, which means the lock is released when the guard leaves the code block or scope it was created in. It implements the Deref and DerefMut traits to provide transparent access to the data under the guard object.
It’s good practice to report all errors in the handler. You can log errors and return a 500 (Internal Server Error) HTTP code to the client. To keep it simple, we’ll use the unwrap method and expect the mutex to lock correctly:
let mut users = user_db.lock().unwrap();
Here, we locked the Mutex for the duration of generating the response. In this case, where we’re creating whole responses immediately, this is fine. In cases where the result is delayed or when we work with a stream, we shouldn’t hold the lock the whole time. That would create a bottleneck for all requests, because the server can’t process requests in parallel if all of them depend on a single shared object. For cases where you don’t have results immediately, you can clone the reference to the mutex and lock it only for the short time you need access to the data.
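Keeping the lock short usually comes down to limiting the scope of the guard. The sketch below is a standard-library illustration with a hypothetical render_report function and a Vec<String> standing in for the database: it copies what it needs inside a small block and does the rest of the work after the guard has been dropped.

```rust
use std::sync::Mutex;

/// Hypothetical example: take a snapshot under the lock, then do the
/// (potentially slow) formatting without holding it.
fn render_report(db: &Mutex<Vec<String>>) -> String {
    let names: Vec<String> = {
        let users = db.lock().unwrap();
        users.clone()
    }; // the guard goes out of scope here, releasing the lock

    // The mutex is free again, so other threads could lock it now.
    assert!(db.try_lock().is_ok());

    names.join(", ")
}

fn main() {
    let db = Mutex::new(vec!["ann".to_string(), "bob".to_string()]);
    println!("{}", render_report(&db));
}
```

Cloning has its own cost, so this trade-off makes sense mainly when the work done after the snapshot is slow compared to the copy.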
REST methods
We want to cover all basic CRUD operations. Using the principles of REST, there are suitable HTTP methods that fit these operations—POST, GET, PUT, and DELETE. We can use the match expression to detect the corresponding HTTP method:
match (method, user_id) {
// Put other branches here
_ => {
response_with_code(StatusCode::METHOD_NOT_ALLOWED)
},
}
Here, we used a tuple with two values: a method and a user identifier, which is represented by the Option<usize> type after the conversion shown earlier. There is a default branch that returns the METHOD_NOT_ALLOWED status (the 405 HTTP status code) if a client requests an unsupported method.
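The shape of this tuple match can be explored without hyper. In the sketch below, the Method enum and the classify function are hypothetical stand-ins, and the returned strings only label which operation each combination would map to.

```rust
// Hypothetical stand-in for hyper's Method type.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Method {
    Get,
    Post,
    Put,
    Delete,
    Patch,
}

/// Labels the operation selected by a (method, user_id) pair.
fn classify(method: Method, user_id: Option<usize>) -> &'static str {
    match (method, user_id) {
        (Method::Post, None) => "create",
        (Method::Post, Some(_)) => "bad request",
        (Method::Get, Some(_)) => "read",
        (Method::Put, Some(_)) => "update",
        (Method::Delete, Some(_)) => "delete",
        // Everything else, including GET without an ID, lands here.
        _ => "method not allowed",
    }
}

fn main() {
    println!("{}", classify(Method::Post, None));
    println!("{}", classify(Method::Get, Some(1)));
    println!("{}", classify(Method::Patch, Some(1)));
}
```

Note that the default branch also catches supported methods used without an ID, such as GET on the bare prefix.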
Let’s discuss every branch of the match expression for every operation.
POST – Creating data
When the server has just started, it doesn’t contain any data. To support data creation, we use the POST method without the user’s ID. Add the following branch to the match (method, user_id) expression:
(&Method::POST, None) => {
let id = users.insert(UserData);
Response::new(id.to_string().into())
}
This code adds a UserData instance to the user database and sends the associated ID of the user in a response with the OK status (an HTTP status code of 200). The Response::new function sets this status by default.
What if the client sets the ID with a POST request? You can interpret this case in two ways—ignore it or try to use the provided ID. In our example, we’ll inform the client that the request was wrong. Add the following branch to handle this case:
(&Method::POST, Some(_)) => {
response_with_code(StatusCode::BAD_REQUEST)
}
This code returns a response with the BAD_REQUEST status code (a 400 HTTP status code).
GET – Reading data
When data is created, we need to be able to read it. For this case, we can use the HTTP GET method. Add the following branch to the code:
(&Method::GET, Some(id)) => {
if let Some(data) = users.get(id) {
Response::new(data.to_string().into())
} else {
response_with_code(StatusCode::NOT_FOUND)
}
}
This code uses the user database to try to find the user by the ID that’s provided in the path. If the user is found, we’ll convert its data to a String and into a Body to send with a Response.
If the user isn’t found, the handler branch will respond with the NOT_FOUND status code (the classic 404 error).
To make the UserData convertible to a String, we have to implement the ToString trait for that type. However, it’s typically more useful to implement the Display trait, because ToString is implemented automatically for every type that implements the Display trait. Add this code somewhere in the main.rs source file:
impl fmt::Display for UserData {
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
f.write_str("{}")
}
}
In this implementation, we return a string with an empty JSON object, "{}". Real microservices should use the serde crate for such conversions.
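To show that to_string really comes for free with Display, here is a sketch with a struct that has one field. UserProfile and its name field are hypothetical and not part of the example server; the hand-written formatting does no JSON escaping, which is one reason a serialization crate is preferable in real code.

```rust
use std::fmt;

// Hypothetical type with a single field, used only for this sketch.
struct UserProfile {
    name: String,
}

impl fmt::Display for UserProfile {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Doubled braces produce literal { and } in the output.
        // No escaping is done here, so this is not a safe JSON encoder.
        write!(f, "{{\"name\":\"{}\"}}", self.name)
    }
}

fn main() {
    let user = UserProfile { name: "alice".to_string() };
    // to_string is available because Display is implemented.
    let body: String = user.to_string();
    println!("{}", body);
}
```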
PUT – Updating data
Once the data is saved, we might want to provide the ability to modify it. This is a task for the PUT method. Use this method to handle changes to the data:
(&Method::PUT, Some(id)) => {
if let Some(user) = users.get_mut(id) {
*user = UserData;
response_with_code(StatusCode::OK)
} else {
response_with_code(StatusCode::NOT_FOUND)
}
},
This code tries to find a user instance in the user database with the get_mut method. This returns an Option: Some with a mutable reference, or None if the corresponding value isn’t found. We can use the dereference operator, *, to replace the data in the storage.
If the user’s data was found and replaced, the branch returns an OK status code. If there’s no user with the requested ID, the branch returns NOT_FOUND.
DELETE – Deleting data
When we don’t need data anymore, we can delete it. This is the purpose of the DELETE method. Use it in the branch as follows:
(&Method::DELETE, Some(id)) => {
if users.contains(id) {
users.remove(id);
response_with_code(StatusCode::OK)
} else {
response_with_code(StatusCode::NOT_FOUND)
}
},
This code checks whether the Slab contains the data and removes it with the remove method. We don’t use the remove method right away because this expects the data to exist in the storage beforehand, and therefore panics if the data is absent.
NOTE: Often, web services don’t actually remove data and instead just mark it as deleted. This is a reasonable thing to do because it allows you to explore the data later and improve the efficiency of the service or the company. However, this is a risky practice. Users should be able to remove their data completely, because sensitive data can represent a threat. New laws, such as the GDPR law (https://en.wikipedia.org/wiki/General_Data_Protection_Regulation), protect the user’s right to own their data and stipulate certain requirements for data protection. Violation of such laws may result in a fine. It’s important to remember this when you work with sensitive data.
Routing advanced requests
In the preceding example, we used pattern matching to detect the destination of a request. This isn’t a flexible technique, because the path often contains extra characters that have to be taken into account. The /user/1/ path, for example, contains a trailing slash, /, which prevents the previous version of our microservice from parsing the user ID. There’s a flexible tool to fix this issue: regular expressions.
Defining paths with regular expressions
A regular expression is a sequence of characters that expresses a pattern to be searched for in a string. Regular expressions provide you with the ability to create tiny parsers that split a text into parts using a formal declaration. Rust has a crate called regex, whose name is a popular abbreviation of the term regular expression. You can learn more about this crate here: https://crates.io/crates/regex.
Adding the necessary dependencies
To use regular expressions in our server, we need two crates: regex and lazy_static. The first provides a Regex type to create regular expressions and match them against strings. The second helps to store Regex instances in a static context. We can only assign constant values to static variables directly, because they’re created when a program is loaded into memory. To use more complex expressions, we’d have to add initialization code that evaluates the expression and assigns the result to a static variable. The lazy_static crate contains a lazy_static! macro to do this job for us automatically. This macro creates a static variable, executes an expression on first access, and assigns the evaluated value to that variable. We could also create a regular expression object for every request in a local variable, rather than a static one. However, this adds runtime overhead, so it’s better to create it once and reuse it.
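The idea of a static that is initialized by running code on first use can also be sketched with the standard library. This is an illustration of the concept, not of the lazy_static! macro itself: it assumes Rust 1.70 or later for std::sync::OnceLock, and the reserved_paths function is hypothetical.

```rust
use std::collections::HashSet;
use std::sync::OnceLock;

/// Hypothetical example: a set that can't be built in a constant expression,
/// created once on first access and reused afterwards.
fn reserved_paths() -> &'static HashSet<&'static str> {
    static PATHS: OnceLock<HashSet<&'static str>> = OnceLock::new();
    PATHS.get_or_init(|| {
        // This closure runs only on the first call.
        ["/", "/index.htm", "/index.html"].iter().copied().collect()
    })
}

fn main() {
    // Both calls return a reference to the same, already-built set.
    let first = reserved_paths();
    let second = reserved_paths();
    assert!(std::ptr::eq(first, second));
    println!("{}", first.contains("/index.html"));
}
```

The rest of this section keeps using lazy_static, which offers the same once-only initialization with a shorter declaration.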
Add both dependencies to the Cargo.toml file:
[dependencies]
slab = "0.4"
futures = "0.1"
hyper = "0.12"
lazy_static = "1.0"
regex = "1.0"
Add two imports, in addition to the imports in the main.rs source file from the previous example:
use lazy_static::lazy_static;
use regex::Regex;
We’ll use the lazy_static macro and the Regex type to construct a regular expression.
Writing regular expressions
Regular expressions contain a special language, used to write a pattern to extract data from a string. We need three patterns for our example:
- A path for the index page
- A path for user management
- A path for the list of users (a new feature for our example server)
There’s a Regex::new function that creates regular expressions. Remove the previous USER_PATH constant and add three new regular expression constants in a lazy static block:
lazy_static! {
    static ref INDEX_PATH: Regex = Regex::new("^/(index\\.html?)?$").unwrap();
    static ref USER_PATH: Regex = Regex::new("^/user/((?P<user_id>\\d+?)/?)?$").unwrap();
    static ref USERS_PATH: Regex = Regex::new("^/users/?$").unwrap();
}
As you can see, regular expressions look complex. To understand them better, let’s analyze them.
Path for index page
The INDEX_PATH expression matches the following paths:
- /
- /index.htm
- /index.html
The expression that fits these paths is "^/(index\\.html?)?$".
The ^ symbol anchors the match to the beginning of the string, while the $ symbol anchors it to the end. With both anchors in place, no prefix or suffix is allowed and the whole path has to match.
The parentheses ( ) define a group. An expression in a group is treated as an indivisible unit.
The ? symbol means that the preceding item (a character or a group) is optional. We place it after the l character to allow the file in the path to have both .htm and .html extensions. As you'll see later, we don't have an index file to read. We use it as an alias of the root path handler. The question mark is also used after the whole group with the file name to fit the empty root path, /.
The dot symbol (.) matches any character, but we need a literal dot. To make the dot literal, we put a backslash (\) before it. Inside a regular Rust string literal, however, a single backslash starts an escape sequence, so we write a pair of backslashes (\\) to put one real backslash into the pattern.
All other characters are treated as is, including the / symbol.
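The double backslash belongs to Rust's string syntax, not to the regular expression itself. As a small illustration, the sketch below compares the escaped literal from our code with a Rust raw string literal, which takes backslashes literally. The constant and function names are made up for this example.

```rust
// The pattern as written in this chapter: the regex backslash is doubled
// so that the Rust string literal contains a single backslash.
const INDEX_ESCAPED: &str = "^/(index\\.html?)?$";

// The same pattern as a raw string literal: backslashes are taken literally.
const INDEX_RAW: &str = r"^/(index\.html?)?$";

// True when both spellings hand the same text to the regex engine.
fn same_pattern() -> bool {
    INDEX_ESCAPED == INDEX_RAW
}
```

Both constants hold exactly the same characters, so either spelling could be passed to Regex::new; raw strings are simply easier to read when a pattern contains many backslashes.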
Path for user management
The USER_PATH expression can fit the following paths:
- /user/
- /user/<id>, where <id> means a group of digits
- /user/<id>/, the same as the previous one, but with a trailing slash
These cases can be handled with the "^/user/((?P<user_id>\\d+?)/?)?$" regular expression. This expression is a bit complex. It includes two groups (one is nested) and some unfamiliar syntax. Let's have a closer look.
?P<name> is a grouping attribute that sets the name of the capturing group. Every group in brackets can be accessed by the regex::Captures object. Named groups can be accessed by names.
\\d is a special expression that matches any digit. To specify that we have one or more digits, we add the + symbol, which means one or more repetitions. The ? that follows the + makes the repetition lazy (it matches as few digits as possible); because the pattern is anchored with $, this doesn't change which paths match. The * symbol, which means zero or more repetitions, is also available, but we haven't used it in our regular expression.
There are two groups. The first is nested with the name user_id. It must include digits only to be parsed to the UserId type. The second is an enclosing group that contains the optional trailing slash. This whole group is optional, meaning that the expression can include a user path without any identifier.
Path for the users list
The USERS_PATH is a new pattern, which we didn’t have in the previous example. We’ll use it to return a full list of users on the server. This pattern fits only two variants of the path:
- /users/ (with a trailing slash)
- /users (without a trailing slash)
The regular expression to handle these cases is quite simple: "^/users/?$".
Matching expressions
We have to reorganize the code of microservice_handler because we can't use regular expressions in a match expression. We extract the method and the path at the start, because we need them for most responses:
let response = {
    let method = req.method();
    let path = req.uri().path();
    let mut users = user_db.lock().unwrap();
    // Put regular expressions here
};
futures::future::ok(response)
The first thing we’ll check is the index page requests. Add the following code:
if INDEX_PATH.is_match(path) {
    if method == &Method::GET {
        Response::new(INDEX.into())
    } else {
        response_with_code(StatusCode::METHOD_NOT_ALLOWED)
    }
This uses the INDEX_PATH regular expression to check whether the request’s path matches the index page request using the Regex::is_match method, which returns a bool value. Here, we’re checking the method of a request, so only GET is allowed.
We’ll then continue the if clause with an alternative condition for the user list request:
} else if USERS_PATH.is_match(path) {
    if method == &Method::GET {
        let list = users.iter()
            .map(|(id, _)| id.to_string())
            .collect::<Vec<String>>()
            .join(",");
        Response::new(list.into())
    } else {
        response_with_code(StatusCode::METHOD_NOT_ALLOWED)
    }
This code uses the USERS_PATH pattern to check whether the client requested the list of users. This is a new path route. After this, we iterate over all the users in the database and join their IDs in a single string.
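The joining step can be tried in isolation. The following sketch assumes a BTreeMap as a stand-in for the user database and an empty UserData struct as a placeholder record; both are simplifications for illustration, not the types the server actually uses.

```rust
use std::collections::BTreeMap;

// Hypothetical placeholder for the user record stored in the database.
struct UserData;

// Builds the comma-separated list of user IDs, in ascending key order.
fn list_user_ids(users: &BTreeMap<usize, UserData>) -> String {
    users
        .iter()
        .map(|(id, _)| id.to_string())
        .collect::<Vec<String>>()
        .join(",")
}
```

With users 0, 1, and 2 inserted and user 1 removed, list_user_ids returns the string 0,2, and an empty map produces an empty string.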
The following code is used to handle REST requests:
} else if let Some(cap) = USER_PATH.captures(path) {
    let user_id = cap.name("user_id").and_then(|m| {
        m.as_str()
            .parse::<UserId>()
            .ok()
            .map(|x| x as usize)
    });
    // Put match expression with (method, user_id) tuple
This code uses the USER_PATH pattern and the Regex::captures method, which returns a Captures object holding the values of all captured groups. If the pattern doesn't match the path, captures returns None. If it does match, we get the object in the cap variable. The Captures struct has a name method that gets a captured value by name; we use user_id as the name of the group. Because this group is optional, the name method returns an Option. We use the and_then method of the Option to replace the match with the parsed ID, converted to usize. As a result, the user_id variable holds an Option with the ID, in the same way as in the previous version of our microservice. To avoid repetition, I skipped the block that matches on the (method, user_id) tuple; just copy this part from the example in the previous section of this chapter.
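The Option chain is easier to follow when it's separated from the regex machinery. In this sketch, the raw parameter plays the role of the optional captured group, already converted to a string slice. It assumes that UserId is an alias for u64, and parse_user_id is a hypothetical helper introduced only for this example.

```rust
// Assumption: UserId is an alias for u64, standing in for the chapter's type.
type UserId = u64;

// Mirrors the closure passed to and_then in the handler: a missing group
// and an unparsable value both end up as None.
fn parse_user_id(raw: Option<&str>) -> Option<usize> {
    raw.and_then(|s| s.parse::<UserId>().ok().map(|x| x as usize))
}
```

A missing group stays None, a value such as "42" becomes Some(42), and text that can't be parsed as a UserId (including a number too large for the type) also becomes None rather than an error.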
The last part is a default handler that returns a response with a NOT_FOUND status code:
} else {
    response_with_code(StatusCode::NOT_FOUND)
}
The service is now complete.
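To check your understanding of which paths reach which branch, the following sketch models the routing decision with hand-written string checks instead of the regex crate. It's an approximation under stated assumptions: the Route enum and the route function are hypothetical, the ID is assumed to be a u64, and only ASCII digits are accepted, which may be narrower than what \d matches.

```rust
// Routes recognized by the handler; a simplified, hypothetical model.
#[derive(Debug, PartialEq)]
enum Route {
    Index,
    Users,
    User(Option<u64>),
    NotFound,
}

// Approximates the three patterns, checked in the same order as the
// if / else if chain in the handler.
fn route(path: &str) -> Route {
    if matches!(path, "/" | "/index.htm" | "/index.html") {
        Route::Index
    } else if matches!(path, "/users" | "/users/") {
        Route::Users
    } else if let Some(rest) = path.strip_prefix("/user/") {
        // Allow one optional trailing slash after the ID.
        let digits = rest.strip_suffix('/').unwrap_or(rest);
        if rest.is_empty() {
            // "/user/" without an ID.
            Route::User(None)
        } else if !digits.is_empty() && digits.bytes().all(|b| b.is_ascii_digit()) {
            // An ID too large for u64 still selects the route, but carries no ID.
            Route::User(digits.parse().ok())
        } else {
            Route::NotFound
        }
    } else {
        Route::NotFound
    }
}
```

Under this model, /user/7 and /user/7/ both yield a user route with ID 7, /user/ yields a user route without an ID, and /user (with no trailing slash) falls through to the not-found branch.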
Testing:
$ curl -X POST http://localhost:8080/user/
0
$ curl -X POST http://localhost:8080/user/
1
$ curl -X POST http://localhost:8080/user/
2
$ curl -X DELETE http://localhost:8080/user/1
$ curl http://localhost:8080/users
0,2