
Microreactors in Organic Chemistry and Catalysis, Second Edition (2013)

4. Automation in Microreactor Systems

Jason S. Moore and Klavs F. Jensen

4.1. Introduction

Microreactors confer a number of advantages over more traditional macro-scale and batch systems. With channel widths on the order of hundreds of micrometers, diffusion rapidly transforms large concentration gradients into a uniform field. This rapid mixing can greatly increase reaction rates for systems typically limited by mass transfer under standard conditions [1, 2]. Additionally, the large wall surface-area-to-volume ratio and the high thermal conductivity of silicon improve heat transfer [3–5], allowing more accurate temperature control [1] and reduced internal temperature gradients [6]. Mixing can be further enhanced by the addition of an immiscible phase, which induces Taylor recirculation [7] and reduces axial dispersion [8]; these effects can be significant at short residence times [9]. In the ideal case of completely separated liquid segments, each segment functions as an individual well-mixed batch reactor.
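As a rough illustration of the scaling argument behind this rapid mixing, the characteristic time for transverse diffusion scales as the square of the channel width divided by the diffusivity. The short Python sketch below compares a microchannel with a centimeter-scale vessel; the channel width and diffusivity values are illustrative assumptions, not values taken from the chapter.

```python
# Characteristic transverse diffusion time t ~ L**2 / D.
# The diffusivity and length scales below are assumed, typical values.
D = 1e-9  # m^2/s, typical small-molecule diffusivity in water

length_scales = {
    "200 um microchannel": 200e-6,   # m
    "1 cm batch length scale": 1e-2,  # m
}

for label, L in length_scales.items():
    t = L ** 2 / D  # seconds
    print(f"{label}: diffusion time ~ {t:.3g} s")
```

Because the time scales with the square of the length, shrinking the mixing length by two orders of magnitude shortens the diffusive mixing time by roughly four orders of magnitude.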

Silicon microreactors have been used in a number of flow systems, such as gas–liquid [10], liquid–liquid [11], and gas–liquid–solid [12] reactions, often in conjunction with other unit operations, such as extraction [13] and distillation [14]. Furthermore, silicon microreactors have been shown to operate at high temperatures and pressures [15]. In addition, the small volumes of microreactors allow potentially dangerous chemistry to be conducted more safely. For example, fluorination and chlorination of aromatics, nitration to form highly energetic compounds, and reactions carried out in the explosive regime have all been conducted safely in microreactors [2].

Throughout the chemical industry, the emphasis on continuous processes continues to grow due to increased process safety, reduced costs, and improved product quality [16], as well as reduced environmental impact [17]. In the pharmaceutical industry, the change from batch to continuous production is less advanced, in part due to strict regulatory requirements that have favored traditional batch processes. Nevertheless, the Food and Drug Administration (FDA) has recently created an initiative promoting the use of in-line process analytical technology “facilitating continuous processing to improve efficiency and manage variability” and to increase “process understanding,” along with automation to aid in drug screening and scale-up [18].

As each stage of clinical trials and the final market production requires larger quantities of the drug, the ability to scale up efficiently becomes important in determining whether a product can move forward. Batch processes are not easily scaled up; for example, a stirred reactor can be scaled up on the basis of constant impeller power input per unit volume, constant impeller rotation frequency, or constant impeller tip speed, all of which generally lead to significantly different power requirements and mixing conditions. Moreover, as there is a finite time before patents expire, the speed at which scale-up steps can be accomplished translates directly into additional time on the market before generics significantly reduce market share [19]. Knowing the kinetics and thermodynamics of the underlying reactions would allow process simulations to be performed to determine the optimal operating conditions under the increased mass-transfer limitations of the larger scale [6].
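To make the scale-up dilemma concrete, the sketch below compares the impeller power implied by the three scale-up rules mentioned above for a hypothetical 100-fold volume increase, assuming geometric similarity and the fully turbulent power relation P = Np·ρ·N³·D⁵. The 100-fold factor and the power relation are illustrative assumptions, not data from the chapter.

```python
# Illustrative comparison of stirred-tank scale-up rules (all values assumed).
# Turbulent regime: P = Np * rho * N**3 * D**5, with geometric similarity D ~ V**(1/3).
volume_ratio = 100.0                 # hypothetical 100-fold volume increase
s = volume_ratio ** (1.0 / 3.0)      # impeller diameter scale factor

power_ratio = {
    "constant power per unit volume": volume_ratio,  # P/V fixed, so P scales with V
    "constant rotation frequency": s ** 5,           # N fixed, so P ~ D**5
    "constant impeller tip speed": s ** 2,           # N ~ 1/D, so P ~ N**3 * D**5 ~ D**2
}

for rule, ratio in power_ratio.items():
    print(f"{rule}: large-scale power ~ {ratio:.0f}x the small-scale power")
```

The three rules imply power requirements that differ by two orders of magnitude, and the corresponding impeller speeds (and hence mixing conditions) differ as well, which is why batch scale-up is rarely straightforward.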

Further cost savings can be realized by increasing automation in the drug development process using microreactors. Automating the optimization of syntheses, especially those that are multi-step or that contain several unit operations, would significantly reduce the labor, time, and materials required [20]. Intelligent updating of reaction conditions based on in-line analytical techniques has been shown to significantly improve optimization performance [21]. Additionally, automation can be applied to kinetic studies as well as screening experiments to reap greater savings.

A number of control and measurement techniques can be incorporated into microreactor systems to automate experiments and increase throughput. Temperature, residence time, and reactant concentration can be easily controlled and measured online. In particular, a number of methods exist for measuring reactant concentration. Fluorescence, UV–Vis, Fourier transform infrared spectroscopy (FTIR), Raman, nuclear magnetic resonance (NMR), chromatography, and mass spectrometry measurements have all been implemented in-line to monitor reactions and allow for increased automation [22]. In one example, an external Raman probe was used to scan residence times down the length of the reactor [23]. In another, a reactor flow path with paraffin walls was created with an attenuated total reflectance (ATR)–FTIR crystal as its base, allowing movies of the concentration fields along the entire flow path to be generated at eight frames per second [24]. Expanding upon these techniques could result in significant advances in continuous in-line monitoring in the future.

A great diversity of methods exists in the field of optimization [25, 26]. A number of optimization techniques begin at a specified initial condition and then step along a trajectory designed to climb toward a single optimum. Other techniques are based on global search algorithms, either dividing the parameter space into a number of branches that are further subdivided as the optimization progresses or using concepts found in nature to incorporate elements of structured randomness, such as simulated annealing [27] and genetic algorithms [28, 29].

The stepping trajectory-based algorithms, such as the simplex [30] and steepest descent [31] methods, are designed as more directed searches than the other types of methods. The simplex method in n dimensions is based around an (n + 1)-vertex hyper-triangle: experiments are performed at each vertex, and the value of an objective function is calculated at each point. The vertex with the worst value of the function is discarded, and that point is reflected through the hyper-triangle to give the next set of conditions to be tested. This process is repeated as the algorithm moves uphill in the objective function. Once the simplex reaches a point where reflecting any vertex results in a poorer function value, the algorithm contracts, either by moving inward from one vertex or by keeping only the best point and moving all other vertices inward [25, 30].
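A minimal sketch of this reflect-or-contract simplex logic is given below for a two-parameter search. The quadratic objective, the contraction factor of 0.5, and the fixed iteration count are assumptions made for illustration; in an automated microreactor system, each call to the objective function would correspond to running one experiment.

```python
import numpy as np

def objective(x):
    # Placeholder for an experimental response (e.g., yield); assumed for illustration.
    return -(x[0] - 1.0) ** 2 - (x[1] - 2.0) ** 2

def simplex_maximize(f, vertices, n_iter=50):
    """Basic reflect-or-shrink simplex search (2 parameters -> 3 vertices)."""
    vertices = np.array(vertices, dtype=float)
    for _ in range(n_iter):
        values = np.array([f(v) for v in vertices])
        worst = np.argmin(values)                        # vertex with the poorest response
        centroid = vertices[np.arange(len(vertices)) != worst].mean(axis=0)
        reflected = centroid + (centroid - vertices[worst])
        if f(reflected) > values[worst]:
            vertices[worst] = reflected                  # accept the reflected point
        else:
            best = np.argmax(values)                     # contract toward the best vertex
            vertices = vertices[best] + 0.5 * (vertices - vertices[best])
    return vertices[np.argmax([f(v) for v in vertices])]

print(simplex_maximize(objective, [[0.0, 0.0], [0.5, 0.0], [0.0, 0.5]]))
```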

Gradient-based methods, such as the steepest descent method (Equation 4.1a), begin from an initial guess and move a distance along the gradient of the objective function to the next set of conditions. Unlike the simplex method, where the step size is initially set and remains unchanged until the area of the optimum is reached, the steepest descent method may vary the step size from step to step (Equation 4.1b) [31]. However, the steepest descent method often produces inefficient zigzag trajectories in narrow valleys or along ridges, and more advanced methods, such as the conjugate gradient method (Equations 4.2a–4.2c), are required to determine better search directions [25].

(4.1a) x_{k+1} = x_k - \alpha_k \nabla f(x_k)

(4.1b) \alpha_k = \arg\min_{\alpha \ge 0} f\left(x_k - \alpha \nabla f(x_k)\right)

(4.2a) x_{k+1} = x_k + \alpha_k d_k

(4.2b) d_k = -\nabla f(x_k) + \beta_k d_{k-1}

(4.2c) \beta_k = \dfrac{\nabla f(x_k)^{\mathrm{T}} \nabla f(x_k)}{\nabla f(x_{k-1})^{\mathrm{T}} \nabla f(x_{k-1})}

with the initial search direction d_0 = -\nabla f(x_0).
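The sketch below contrasts the update rules of Equations 4.1 and 4.2 on a simple quadratic test function with a narrow valley. The test function, the exact line search, and the Fletcher–Reeves form of β are assumptions chosen only to illustrate the slow zigzag behavior of steepest descent relative to the conjugate gradient directions.

```python
import numpy as np

# Narrow quadratic "valley": f(x) = 0.5 * x^T A x, with its minimum at the origin.
A = np.diag([1.0, 25.0])
grad = lambda x: A @ x
step = lambda x, d: -(grad(x) @ d) / (d @ A @ d)   # exact line search along direction d

def steepest_descent(x, n_iter=20):                 # Equation 4.1: follow -grad f
    for _ in range(n_iter):
        d = -grad(x)
        x = x + step(x, d) * d
    return x

def conjugate_gradient(x, n_iter=2):                # Equation 4.2: conjugate directions
    d = -grad(x)
    for _ in range(n_iter):
        g_old = grad(x)
        x = x + step(x, d) * d
        g_new = grad(x)
        beta = (g_new @ g_new) / (g_old @ g_old)    # Fletcher-Reeves formula (4.2c)
        d = -g_new + beta * d
    return x

x0 = np.array([2.0, 2.0])
print("steepest descent (20 steps):", steepest_descent(x0))
print("conjugate gradient (2 steps):", conjugate_gradient(x0))
```

For a two-dimensional quadratic, the conjugate gradient iteration reaches the optimum in two steps, whereas steepest descent is still zigzagging toward it after twenty.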

The SNOBFIT (Stable Noisy Optimization by Branch and Fit) algorithm by Huyer and Neumaier [32] is one example of an algorithm that attempts to find the global optimum by partitioning the parameter space into several local searches. The algorithm first attempts to increase the chance of finding a global optimum by creating a space-filling set of initial points at which the objective function will be evaluated. It then creates a local quadratic model around each point using a set of the point's nearest neighbors. The optimum of each model is calculated, and if the objective function has not already been evaluated at these points, a set number of the best points are added to the next list of points to be evaluated. If the number of points to be evaluated in the next step falls below a prescribed number, additional random points in unexplored areas are added. The major advantage of this algorithm is that, unlike the trajectory-based algorithms described above, SNOBFIT is not a greedy algorithm; thus, it will not become trapped in a local optimum that is not the global optimum. However, while this algorithm is able to span the entire parameter space and find a global optimum, such convergence is only guaranteed as the number of function evaluations becomes large [32].
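The following is not the SNOBFIT implementation, but a much simplified, one-dimensional sketch of the branch-and-fit idea it describes: a space-filling initial design, local quadratic surrogates fitted through each point's nearest neighbors, proposal of each surrogate's optimum, and a few random points to keep exploring unvisited regions. The test function and all numerical settings are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def objective(x):
    # Assumed multimodal test function standing in for an experimental response.
    return np.sin(3.0 * x) + 0.5 * x

lo, hi = 0.0, 4.0
xs = list(np.linspace(lo, hi, 6))          # space-filling initial design
ys = [objective(x) for x in xs]

for _ in range(10):                        # each round mimics one batch of experiments
    proposals = []
    order = np.argsort(xs)
    for i in order[1:-1]:                  # fit a local quadratic around each interior point
        nbrs = sorted(range(len(xs)), key=lambda j: abs(xs[j] - xs[i]))[:3]
        a, b, c = np.polyfit([xs[j] for j in nbrs], [ys[j] for j in nbrs], 2)
        if a < 0:                          # concave fit: its vertex is a candidate maximum
            proposals.append(float(np.clip(-b / (2 * a), lo, hi)))
    proposals.append(rng.uniform(lo, hi))  # random point to keep exploring unvisited regions
    for x in proposals:
        if all(abs(x - seen) > 1e-3 for seen in xs):
            xs.append(x)
            ys.append(objective(x))

best = int(np.argmax(ys))
print(f"best point x = {xs[best]:.3f}, response = {ys[best]:.3f}")
```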

One example of a nature-mimicking optimization is the memetic algorithm, which combines aspects of more traditional trajectory-based hill climbing with a genetic algorithm. The genetic algorithm creates an initial random population of conditions and codes each into an array, which serves as a chromosome analog. Once the value of the objective function is evaluated at each condition, each chromosome is given a fitness score. Higher scores increase the likelihood that the corresponding chromosome will be included in the pool from which the next generation is created by mimicking the crossing-over and mutation steps of meiosis. A variety of modifications can be made to this algorithm, such as using more than two parents per offspring or including a so-called incest prevention step, which checks the Hamming distance between candidate parents and prevents two parents that are much more closely related than the rest of the population from breeding, thereby avoiding premature convergence. Once the genetic algorithm has narrowed in on the area containing the optimum, another optimization technique, such as steepest descent, is used to fine-tune the optimum [29].
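A minimal sketch of the genetic-algorithm core described above (binary chromosomes, fitness-proportional selection, single-point crossover, and mutation) is shown below; the local hill-climbing refinement that makes the algorithm memetic is omitted for brevity. The bit length, population size, mutation rate, and test response are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
N_BITS, POP, GENERATIONS, MUTATION = 16, 20, 40, 0.02

def decode(bits):
    # Map a 16-bit chromosome onto an assumed parameter range [0, 4].
    return 4.0 * int("".join(map(str, bits)), 2) / (2 ** N_BITS - 1)

def fitness(bits):
    x = decode(bits)
    return np.sin(3.0 * x) + 0.5 * x              # assumed test response

pop = rng.integers(0, 2, size=(POP, N_BITS))      # random initial population
for _ in range(GENERATIONS):
    scores = np.array([fitness(ind) for ind in pop])
    probs = scores - scores.min() + 1e-9
    probs /= probs.sum()                          # fitness-proportional selection
    children = []
    for _ in range(POP):
        pa, pb = pop[rng.choice(POP, size=2, p=probs)]
        cut = rng.integers(1, N_BITS)             # single-point crossover
        child = np.concatenate([pa[:cut], pb[cut:]])
        flip = rng.random(N_BITS) < MUTATION      # mutation
        child[flip] = 1 - child[flip]
        children.append(child)
    pop = np.array(children)

best = pop[np.argmax([fitness(ind) for ind in pop])]
print(f"best parameter ~ {decode(best):.3f}")
```

In a memetic variant, the best chromosome of the final generation would then be passed to a local search such as steepest descent for fine-tuning.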

A more probabilistic algorithm is simulated annealing, which is designed around principles of statistical mechanics and mimics the movement of atoms at a given temperature. Starting from a random initial position, each subsequent step is generated by a random displacement, and the objective function is treated as an energy surface. If the resulting objective function value is more favorable, the step is accepted; if it is less favorable, the step may still be accepted, with a probability given by the exponential of the negative of the change in the objective function divided by an effective temperature. Initially, the temperature is set high enough that the optimization is able to move between local optima, but as the optimization proceeds, the temperature is lowered, causing the optimization to become trapped in a single optimum [27]. Similar to this cooling process, the mutation rate of the genetic algorithm can be set high initially and made to decrease with each generation [28]. The major downside of both of these algorithms is that they generally require a large number of evaluations of the objective function, making them untenable for experimental optimization unless individual experiments can be completed on the order of seconds.
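A minimal sketch of the Metropolis-style acceptance rule at the heart of simulated annealing is shown below; the test response, the Gaussian step size, and the geometric cooling schedule are illustrative assumptions.

```python
import math
import random

random.seed(0)

def objective(x):
    # Assumed multimodal response (maximization target).
    return math.sin(3.0 * x) + 0.5 * x

x = random.uniform(0.0, 4.0)                 # random starting position
temperature = 1.0
for _ in range(2000):
    candidate = min(max(x + random.gauss(0.0, 0.2), 0.0), 4.0)  # random displacement
    delta = objective(candidate) - objective(x)
    # Metropolis criterion: always accept improvements; otherwise accept
    # with probability exp(delta / T), which shrinks as T is lowered.
    if delta > 0 or random.random() < math.exp(delta / temperature):
        x = candidate
    temperature *= 0.997                      # gradual cooling schedule

print(f"final position x = {x:.3f}, response = {objective(x):.3f}")
```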