Chain of Refined Perception in Self-optimizing Assembly of Micro-optical Systems

Today, the assembly of laser systems requires a large share of manual operations due to the complexity of optimally aligning the optics. Although the feasibility of automated alignment of laser optics has been shown in research labs, the development effort for assembly automation does not meet economic requirements, especially for low-volume laser production. This paper presents a model-based and sensor-integrated assembly execution approach for flexible assembly cells consisting of a macro-positioner covering a large workspace and a compact micromanipulator with a camera attached to the positioner. In order to make full use of available models from computer-aided design (CAD) and optical simulation, sensor systems at different levels of accuracy are used for matching perceived information with model data. This approach is named "chain of refined perception", and it allows for automated planning of complex assembly tasks along all major phases of assembly such as collision-free path planning, part feeding, and active and passive alignment. The focus of the paper is put on the in-process image-based metrology and information extraction used for identifying and calibrating local coordinate systems, as well as on the exploitation of that information for a part feeding process for micro-optics. Results are presented regarding the automated calibration of the robot camera as well as of the local coordinate systems of the part feeding area and the robot base.


Introduction
Optical systems and lasers belong to high-technology sectors with high technical and economic potential in the near future. Especially the laser industry is regarded as highly innovative, with a leverage effect on other industrial branches. New and improved products are developed and brought to the market frequently. Diode laser systems have a market share of about 50 %, and they are characterized by good energy efficiency and small size on the one hand and a relatively large beam divergence angle on the other. The latter requires the challenging assembly of collimation optics. The scope of this paper addresses the assembly of micro-optics and especially the assembly of collimation optics in diode laser systems.
The alignment of micro-optics requires ultra-high precision in up to six degrees of freedom. For meeting the demands of the alignment task, active alignment needs to be applied, which means that relevant beam characteristics are monitored and evaluated during the alignment process of the optics. The observed values are processed cognitively by the operator or by dedicated program logic. Due to its complexity, industrial assembly of high-technology diode laser systems is dominated by manual processes, which determine the majority of overall production costs. The feasibility of automating such assembly tasks has been proven in several research projects (Brecher, 2012; Haag and Härer, 2012; Loosen et al., 2011; Pierer et al., 2011; Miesner et al., 2009) as well as in a few industrial applications. A breakthrough of automation in this field has not yet been achieved. Mainly economic reasons are accountable for this situation, as many business cases involve relatively small production volumes, so that automation is not profitable due to a large portion of non-operational times caused by planning, commissioning and frequent changeovers. In recent years, flexible assembly systems for optics assembly have been developed (Brecher, 2012; Haag and Härer, 2012) aiming for shorter non-operational times and hence for higher machine utilization in scenarios with many product variants. Flexibility has mainly been achieved through modularization of tools and standardization of mechanical interfaces. Distributed multi-agent systems have been implemented in order to provide flexible architectures for assembly execution. Further flexibility can be achieved through interaction of a machine with its environment. This requires sensor integration allowing the perception of the environment (Russell et al., 2010).
Higher flexibility and sensor integration lead to higher complexity, which is a challenge regarding the efficient planning and commissioning of alignment processes for optics. The work presented in this paper is motivated by the current discrepancy between the benefits of flexibility and sensor integration and the increased complexity they entail. Therefore, this paper shows how 2-D bin-picking of micro-optics in part feeding can efficiently be realized and embedded in a model-based control scenario using low-cost hardware.

Chain of refined perception
The integration of sensors allows the perception of crucial process data and its use for optimizing individual steps as well as the overall result of the assembly task. For making full use of available models, such as the geometric model of the product and the assembly cell from computer-aided design (CAD) or the optical setup from ray-tracing simulation, coarse information from the large workspace and high-resolution information from local regions has to be evaluated. In most cases it is inconvenient or even impossible to use high-precision sensors that cover a large workspace at the same time.
This section introduces the architecture of flexible assembly systems and the principles of self-optimizing optics assembly on which the concept of this work is built. The final part of this section introduces the chain of refined perception, which is applied for the model-based execution of optics assembly.

Flexible assembly cell concept for micro-optical systems
Flexible assembly systems for micro-optics usually combine a macro-workspace, covering a large area with a robot or gantry, with a micro-workspace in which a micromanipulator locally carries out sensor-guided high-resolution motion in the submicrometer range (Brecher et al., 2012).
In previous research projects, modular micromanipulators with three or six degrees of freedom have been developed to enable common robotic systems and gantries to carry out micro-optical assembly. Additional modules such as cameras can be integrated in the micromanipulator in order to monitor the grasped part or the grasping area (Brecher et al., 2013). Additionally, such mobile cameras can be used to detect local reference marks for the calibration of spatial relations between local coordinate systems, as will be described in the following sections. Schmitt et al. (2008) describe a multi-agent system for providing the required flexibility regarding the control architecture of a flexible assembly system.

Self-optimizing assembly of laser optics
One focus of the research in the Cluster of Excellence "Integrated Production Technology for High-wage Countries" at RWTH Aachen University is put on self-optimizing assembly systems, which aim for the reduction of planning efforts for complex and sensor-based assembly tasks (Brecher, 2012). Self-optimizing assembly of laser optics is applied for the production of high-quality laser modules coping with finite positioning accuracy of the actuation system, noisy perception, and tolerances of laser beam sources and optics. Therefore, model-based approaches for assembly execution in the presence of uncertainties are investigated.
Conceptually, self-optimizing systems follow a three-step cycle. Firstly, the current situation is analyzed considering the objective of the task, the current state of the assembly system including the product, as well as a knowledge base holding additional information provided prior to assembly or collected during assembly execution. Secondly, internal objectives such as reference values for internal closed-loop controls are adapted based on reasoning on the analysis carried out in the previous step. This step goes beyond the classical definition of closed-loop controls and adaptive closed-loop controls. In a third step, self-optimizing systems adapt their behavior either through parameter adaption or through structural changes.
Hence, key aspects of self-optimizing assembly systems are model-based control and sensor integration. Model-based control allows automatisms during the planning phase and therefore drastically reduced planning times. Yet the approach requires the use of sensors in order to identify and compensate differences between ideal models and real-world situations.
Figure 1 shows the reduced ontology of a model-based self-optimizing assembly system. Different types of models, such as product models (e.g., geometry, optical function) and production system models (e.g., kinematic chains), as well as process knowledge and system objectives, provide information for the cognition unit to select and configure algorithms and program logic. For example, the product model might provide a certain geometrical or optical constraint to be fulfilled by an assembly step. The cognition unit selects a certain type of mounting sequence consisting of a standardized sequence of steps (mounting template) such as part pickup, dosing of adhesives, active alignment, etc. In the context of this paper the part pickup is of special interest. The rough coordinates of the optical element can be retrieved from the geometrical model of the production system. The small size and the presence of uncertainties require the localization of the part with a precision sufficient for part pickup. This paper presents a sensor-guided approach for localizing micro-optical parts realized on a low-cost robot-based assembly station.

Chaining of process steps in micro-optical assembly
In order to overcome the gap between ideal models and uncertain reality, crucial assembly steps are implemented based on sensor guidance. Individual tasks during the assembly of micro-optics, such as the pickup of parts or the alignment of optics, require different levels of accuracy, ranging roughly from 10 mm measuring accuracy achieved by low-cost structured light sensors down to 100 nm positioning accuracy achieved through active alignment. Figure 2 shows the concept of a chain of refined perception as proposed by the presented work. For carrying out process steps, this concept uses several means of perception at different levels of granularity. The objective of one process step is to transform the assembly state to the tolerance level of the subsequent process step. The approach combines the advantages of planning complete assembly tasks based on models with the flexibility and precision of sensor-integrated systems. Figure 2 shows the chain of refined perception for the case of micro-optical assembly.
At the top level, it covers a large workspace in the range of one or more cubic meters for autonomously planning collision-free paths of the macro-positioner. At the bottom level, a motion resolution for optical alignment in the range of 10 nm is possible.
For the task of collision-free path planning, software tools such as MoveIt!, part of the ROS framework, have been developed in the robotics community. The work related to this paper applies such software in combination with structured light sensors such as the Microsoft Kinect. The environment can be scanned in 3-D with such sensors. Additional point cloud processing software allows the matching of CAD models with the detected point clouds. The result is a collision model that allows the planning of collision-free paths.
For tasks such as 2-D bin-picking of micro-optical components, local coordinate systems need to be calibrated with reference to each other. Figure 3 shows a typical setup with a fixed camera and a mobile camera (the mobile camera and the micromanipulator it is attached to are carried by a positioning system such as a robot or a gantry). The fixed camera and its objective cover a large area such as a part carrier. The mobile camera covers a much smaller area intended for detecting local reference marks. The detection of defined reference marks allows the calibration of the cameras and their spatial relation.
Passive alignment is a step usually required prior to active alignment, and it is based on the detection of reference marks or geometric features using charge-coupled device (CCD) chips. During passive alignment, parts are pre-positioned with reference to each other so that the initial starting point for active alignment lies within a certain tolerance with high probability.
In the context of optics assembly, the task of active alignment accounts for the quality of the optical system. In the case of collimation optics, an optical measurement setup and a CCD chip are used for determining the current state of alignment. Alignment algorithms are the subject of recent and ongoing research activities (Brecher, 2012; Haag and Härer, 2012; Pierer et al., 2011; Miesner et al., 2009).

Calibration of stationary and mobile camera
As depicted above, two camera systems are used in the setup for the calibration of the positioner coordinate system with reference to the local coordinate system defined by local reference marks. The first camera is mounted perpendicularly above the part carrier. The second is mounted to the micromanipulator, which is attached to the macro-positioner (cf. Fig. 3). In order to use image data as input for further calculations, both cameras have to be calibrated first. The calibration process allows compensating optical and perspective distortion and determining a scaling factor between image pixels and real-world metrics.

Calibration of stationary camera
The stationary camera system is equipped with a common entocentric lens and monitors parts on the Gel-Pak magazine (part carrier). The predominant kinds of distortion are a radial barrel distortion, which is generally associated with the deployed kind of lens, and a trapezoidal distortion resulting from a misalignment of the camera with respect to its optimal perpendicular orientation. The scaling factor is calculated for the surface plane of the Gel-Pak because it depends on the object's distance to the camera. Determining and compensating camera distortion is a common task in computer vision. Hence, algorithms are widely available as frameworks in many programming languages.
The scaling factor can be obtained during the determination of the distortion or of the local coordinate system simply by comparing known physical features, such as the calibration pattern or the distance between two reference marks, with their representation in the image.
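As a minimal sketch of this comparison (the mark coordinates and the physical distance below are illustrative values, not data from the actual setup), the scaling factor follows from dividing the known physical distance between two reference marks by their pixel distance in the image:

```python
import numpy as np

def scaling_factor(p1_px, p2_px, physical_distance_mm):
    """Return the mm-per-pixel scaling factor from two reference
    marks detected in the image and their known physical distance."""
    d_px = np.linalg.norm(np.asarray(p2_px, float) - np.asarray(p1_px, float))
    return physical_distance_mm / d_px

# Illustrative example: marks detected 500 px apart, physically 10 mm apart
s = scaling_factor((100.0, 200.0), (500.0, 500.0), 10.0)
print(round(s, 4))  # 0.02 mm per pixel
```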
The calibration process has to be carried out when the position or orientation of the stationary camera changes.

Calibration of mobile camera
The mobile camera system is equipped with a telecentric lens to provide local image data of components during assembly tasks. Due to the specific properties of telecentric lenses, there is no need to compensate any imaging deformations caused by perspective. Therefore, the calibration process only includes the identification of the relationship between the camera's local coordinate system and the robot's tool center point (TCP) (cf. Fig. 4). The telecentric lens is mounted approximately at the center of the robot's TCP, while the fixed focus plane is tuned to be aligned with an attached gripper. For a mathematical coordinate transformation, four parameters have to be identified: the x and y offset of the image center from the z axis of the TCP, the camera orientation described by the angle between both x axes, and finally the scaling factor to transform pixels into millimeters.
To obtain the scaling factor, the camera is positioned above a calibration pattern (dot target) with known physical features. The distance between two points in the image is then compared to its physical equivalent (cf. Fig. 5).
The angular offset is determined in a two-step approach. First, the robot camera is positioned above a reference mark. The camera image is then analyzed to determine its center point, which is stored along with the current robot coordinates. For the second step, the robot is moved in a plane so that the reference mark stays in the image region, which is then analyzed again. The orientation can be calculated by comparing the vector described by the movement of the robot with the movement of the reference mark in the image (cf. Fig. 6).
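The comparison of the two vectors can be sketched as follows. All values are illustrative, and the sign conventions (direction of the image y axis, and the fact that the mark appears to move opposite to the camera) must be matched to the actual setup:

```python
import numpy as np

def camera_orientation(robot_move_xy, mark_move_px):
    """Angle (rad) between the robot's movement vector and the
    displacement vector of the reference mark in the image.
    Sign conventions depend on the axis definitions of the setup."""
    vr = np.asarray(robot_move_xy, float)
    vi = np.asarray(mark_move_px, float)
    ang = np.arctan2(vr[1], vr[0]) - np.arctan2(vi[1], vi[0])
    # Wrap the difference into (-pi, pi].
    return (ang + np.pi) % (2 * np.pi) - np.pi

# Illustrative: robot moves along +x, mark moves along -y in the image
print(camera_orientation((10.0, 0.0), (0.0, -300.0)))  # pi/2
```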
Due to the fact that the camera is oriented parallel to the z axis of the TCP, its x and y offset can be determined by stepwise rotation of the TCP and by analyzing the path of a reference mark in the image. The path is expected to describe a circle, which can be fitted to the measured points. Its center point marks the origin of the x and y axes of the TCP (cf. Fig. 7).
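The circle fit can be sketched with an algebraic (Kåsa) least-squares formulation; the synthetic points below only verify the math and are not data from the described experiments:

```python
import numpy as np

def fit_circle(points):
    """Algebraic (Kasa) least-squares circle fit.
    points: (N, 2) array of reference-mark image positions recorded
    while the TCP is rotated stepwise. Returns (cx, cy, r)."""
    pts = np.asarray(points, float)
    x, y = pts[:, 0], pts[:, 1]
    # Solve 2*cx*x + 2*cy*y + c = x^2 + y^2 with c = r^2 - cx^2 - cy^2
    A = np.column_stack([2 * x, 2 * y, np.ones(len(pts))])
    b = x ** 2 + y ** 2
    (cx, cy, c), *_ = np.linalg.lstsq(A, b, rcond=None)
    r = np.sqrt(c + cx ** 2 + cy ** 2)
    return cx, cy, r

# Synthetic check: points on a circle around (50, -20) with radius 30
t = np.linspace(0, 2 * np.pi, 12, endpoint=False)
pts = np.column_stack([50 + 30 * np.cos(t), -20 + 30 * np.sin(t)])
cx, cy, r = fit_circle(pts)
print(round(cx), round(cy), round(r))  # 50 -20 30
```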

Calibration of local coordinate systems
In order to calculate the position and orientation of components in robot coordinates, a local coordinate system has to be defined first. Therefore, two reference marks have been placed alongside the Gel-Pak to identify the origin and the direction of the y axis. The z axis is defined to be perpendicular to the surface, pointing upwards. Finally, the x axis is positioned to complete a right-handed coordinate system.
The reference marks must be positioned in the image area of the stationary camera in order to be identified and used to describe parts on the Gel-Pak in a local coordinate system. However, the distance between the two reference marks should be maximized in order to minimize the error in the measured y axis orientation.
The local coordinate system can be automatically measured in positioner coordinates by moving the positioner with its mobile camera above each reference mark. The image data can then be analyzed to detect the center point of each reference mark. Based on the calibration of the mobile camera, the coordinates can further be transformed into TCP coordinates and finally into positioner base coordinates.
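Assuming the two reference marks have already been measured in base coordinates (one defining the origin, the other lying on the positive y axis, with z pointing up as defined above), the planar transform from the local frame into base coordinates can be sketched as:

```python
import numpy as np

def local_to_base(part_local_xy, origin_base, y_mark_base):
    """Transform a part position from the Gel-Pak's local 2-D frame
    into positioner base coordinates. origin_base and y_mark_base are
    the two reference marks measured in base coordinates."""
    o = np.asarray(origin_base, float)
    ey = np.asarray(y_mark_base, float) - o
    ey /= np.linalg.norm(ey)            # unit y axis of the local frame
    ex = np.array([ey[1], -ey[0]])      # right-handed with z pointing up
    R = np.column_stack([ex, ey])       # 2x2 rotation, local -> base
    return o + R @ np.asarray(part_local_xy, float)

# Illustrative: local frame shifted by (10, 10), y axis parallel to base y
print(local_to_base((3.0, 4.0), (10.0, 10.0), (10.0, 20.0)))  # [13. 14.]
```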

2-D Bin-picking of micro-optical components
Part identification has been implemented as a stand-alone application. Through standard networking APIs, process control scripts can retrieve detailed information about every detected part on the magazine (cf. Fig. 8). OpenCV has been used for image processing; see Laganière (2011) and Bradski (2000) for reference.
Gel-Paks® provide a convenient way to handle optical components during transport. Due to a proprietary elastomeric material, parts can be placed freely on the carrier and are kept in position to ensure safe transportation and storage. Hence, this kind of magazine is a standard way of presenting optical components. Pickup positions can therefore no longer be statically defined and have to be identified through sensor evaluation. In the setup presented in this paper, a camera fixed above the Gel-Pak® covers the complete lens presentation area as well as a set of reference marks in its field of view (cf. Fig. 9). Applying image processing, optics can be located in the local 2-D coordinate system. In order to carry out the robot-based pickup, the local coordinate system needs to be calibrated with respect to the positioner's base coordinate system as explained above.

2-D localization of micro-optical components
For an automated pickup process, optical components on the Gel-Pak have to be localized and identified. The localization step determines the x and y position and the orientation of all parts in a local coordinate system. In a following step, parts are distinguished and grouped by their type. This is accomplished by comparing visual features which, in combination, allow a reliable identification of the investigated optical components. These features include, without limitation, the length and width, the visible area and its perimeter, as well as different ratios of these parameters. The grayscale histogram is also suitable to distinguish and group parts. Formed groups can finally be mapped to templates, which have to be configured only once for every new component type.
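The mapping of feature groups to templates can be sketched as a nearest-template classification over a small feature vector; the template names, feature choices and tolerance below are hypothetical and not the values used in the paper:

```python
import numpy as np

# Hypothetical templates: nominal (length_mm, width_mm, area_mm2) per type.
TEMPLATES = {
    "GRIN": np.array([2.0, 0.6, 1.1]),
    "HR":   np.array([1.0, 1.0, 0.9]),
}

def classify(features, rel_tol=0.15):
    """Assign a detected part's feature vector to the closest template,
    or return None if no template matches within the relative tolerance."""
    f = np.asarray(features, float)
    best, best_err = None, rel_tol
    for name, ref in TEMPLATES.items():
        err = np.max(np.abs(f - ref) / ref)  # worst relative deviation
        if err < best_err:
            best, best_err = name, err
    return best
```

Each new component type then only requires adding one template entry, matching the one-time configuration described above.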
Part descriptions based on salient points are not suitable because of the small and mostly homogeneous surfaces, which do not offer many features.
The image segmentation is based on binary thresholding with a watershed algorithm to achieve accurate edges. In order to enhance the contrast between the Gel-Pak and the mostly transparent optical components, dark field lighting has been introduced in the experimental setup. Occasionally, parts such as cylindrical GRIN optics lead to separated blobs, which have to be combined in a post-processing step. This has been implemented as a heuristic rule that combines closely lying blobs (cf. the two deflections in the plot of Fig. 10). The separation of the blobs is caused by the dark field illumination and the reflection on the cylindrical surface of the GRIN lens.
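The heuristic combination rule can be sketched as a union-find over blob centroid distances; the distance threshold is an assumed parameter, not a value from the paper:

```python
import numpy as np

def merge_close_blobs(centroids, max_gap):
    """Merge blob centroids closer than max_gap (px) into single parts,
    e.g. the two reflections produced by a GRIN lens under dark field
    illumination. Simple union-find over pairwise distances."""
    pts = np.asarray(centroids, float)
    parent = list(range(len(pts)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i in range(len(pts)):
        for j in range(i + 1, len(pts)):
            if np.linalg.norm(pts[i] - pts[j]) < max_gap:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(len(pts)):
        groups.setdefault(find(i), []).append(i)
    # Return one merged centroid per group of combined blobs
    return [pts[idx].mean(axis=0) for idx in groups.values()]
```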

Height measurement through variation of focus
The stationary camera can only be used to obtain two-dimensional information about the position and orientation of a part. For a fully automated process, the height information has to be detected as well.
Due to the fact that the focus plane of the mobile camera is at a fixed and known distance to the lens, focus measurements can be utilized to determine the z coordinate of an investigated surface, comparable to the autofocus feature of a camera. Therefore, the positioner is moved in small steps towards the surface of the Gel-Pak. At each step the camera image is analyzed. In Nayar and Nakagawa (1990) and Firestone et al. (1991), different algorithms are presented to quantify the focus quality. The presented results are based on the Laplace operator (sum of second partial derivatives). The focus of an image correlates with the sharpness of edges in the image, which can be extracted with a Laplace filter. To weaken the effect of noise, the Laplacian of Gaussian filter is applied to the investigated image region. The focus is then quantified by the weighted average of the obtained pixel intensities. In Fig. 11, normalized focus measurements are plotted against the z coordinate of the robot. The focus plane is determined by the absolute maximum, which can be numerically calculated.
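A simplified sketch of this focus measure (a plain discrete Laplacian; the Gaussian pre-filter used in the paper is omitted for brevity) and of selecting the best-focused z position:

```python
import numpy as np

def laplacian_focus(img):
    """Focus measure: mean absolute response of a discrete Laplacian
    (second differences in x and y). Sharper images yield larger values."""
    img = np.asarray(img, float)
    lap = (img[:-2, 1:-1] + img[2:, 1:-1] +
           img[1:-1, :-2] + img[1:-1, 2:] - 4 * img[1:-1, 1:-1])
    return np.mean(np.abs(lap))

def best_focus_z(z_positions, images):
    """Return the z coordinate whose image maximizes the focus measure,
    i.e. the position where the surface lies in the focus plane."""
    scores = [laplacian_focus(im) for im in images]
    return z_positions[int(np.argmax(scores))]
```

In the actual process the discrete maximum would be refined numerically, e.g. by interpolating the focus curve around its peak.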
Autofocus algorithms must have a reliable and early abort criterion because the focus plane is tuned to be aligned with an attached gripping tool. The assigned micromanipulator allows pulling up the gripper by 2 mm, which is generally enough vertical space for an automated positioning of the positioner. During the course of this work, no robust autofocus could be implemented; only well-structured surfaces have led to acceptable results. Therefore, different approaches such as stereo vision might be used in the future, although this might increase the costs of the assembly solution.

Evaluation of results
In the following, results regarding the camera calibration as well as the part localization processes are presented. In the case of camera calibration, two measurements of 40 repetitions each have been carried out. Between the measurements, the mobile camera has been unmounted from the mechanical interface and remounted again. The results are summarized in Table 1.
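Assuming the reported 6σ levels denote six times the sample standard deviation of the repeated measurements (the paper does not state the exact definition), they can be computed from the 40 repetitions as:

```python
import numpy as np

def six_sigma(samples):
    """6-sigma level of a repeated measurement: six times the sample
    standard deviation (ddof=1) of the recorded values."""
    return 6.0 * np.std(np.asarray(samples, float), ddof=1)
```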
At a 6σ level, the scaling factor error accounts for an absolute error of less than 0.02 %. The 6σ level for the orientation was identified as 0.2112 degrees, which is sufficient for micro-optical part pickup. The 6σ level of the X-Y offset is below an error of 20 µm. According to these first results, the calibration is sufficiently precise for the task of micro-optical part pickup. For more reliable results, more changeover scenarios need to be carried out. The quality of the calibration is strongly determined by the repeatability of the positioning system. The calibration of the mobile camera should be carried out after each camera changeover.
Using a single calibration configuration of the stationary camera, part localization has been carried out. Localizing an individual GRIN lens 40 times has led to 6σ levels of 1.5 µm and 2.2 µm for the X and Y offsets, respectively.
Detected parts are grouped and colored for convenience as shown in Fig. 12. After calibrating the local coordinate system, the positioner is moved above each detected component. The image from the mobile camera is then processed in order to evaluate the achieved precision. The component center is detected analogously to the algorithm of the stationary camera and then compared to the image center. Exemplary results are given in Fig. 13. Positioning errors in the X and Y directions are obvious. The results show that part of the error seems to be systematic, depending on the corner of the part magazine that was approached by the robot. All of the error offsets are within a range of 70 µm (most of them even within a range of 30 µm). One explanation for this behavior is that the robot used was at a prototype stage during the work, and that the kinematic transformations on the robot controller do not precisely correspond to the actual kinematic structure. The repeatability of the robot is sufficient for the pickup process.
The implemented image-based part localization and identification allows for a reliable pickup process for the investigated optical components. Currently, height information for each component type is provided by the operator or an underlying geometrical model. This ensures collision avoidance, since the presented measurement via focus determination needs further investigation. The work has shown that a dependency exists between the surface structure of the optical component and the quality of the height measurement. For well-structured surfaces, the Laplacian approach leads to acceptable results. Also, the Tenengrad algorithm, as mentioned in Nayar and Nakagawa (1991), performed well. Tenengrad is based on two Sobel operators calculating gradients in the horizontal and vertical directions. A 6σ level of 28.2 µm (Laplace) and 18 µm (Tenengrad) has been achieved in individual cases. Figure 14 shows a single measurement run of the autofocus algorithm for a well-structured surface. For transparent parts or parts with large homogeneous surfaces showing no structures, no reliable results have been achieved yet. Table 2 presents the results for a non-transparent and well-structured heating element. Smaller step sizes (e.g., 0.1 mm) led to worse results for both algorithms.

Summary and outlook
The paper presented a concept of chaining process steps, where each step transforms the assembly state to the next, more granular level, and named it the "chain of refined perception". This concept was motivated and conceptually embedded in the context of self-optimizing micro-optical assembly systems. Such systems strongly utilize model-based control architectures, which need continuous matching with measurement data. Model-based control is an enabler for automated planning and optimization algorithms such as path planning. Sensor integration is still required to meet the precision requirements.
In more detail, techniques necessary for a computer-vision-based feeding of optical components have been presented and evaluated. Implemented in manual laboratory processes, this allows for a convenient way to support operators. In an automated and self-optimizing scenario, it completes the chain of refined perception. A reliable calibration routine has been presented for identifying camera parameters such as perspective distortion and for determining the scaling factor between image pixels and real-world metrics. Another routine was depicted for calibrating a local coordinate system with respect to the positioner base coordinates. Such calibration allows for picking up randomly aligned optical components. This approach was enhanced by a strategy for identifying the z coordinate of a plane through a sequence of images collected by the mobile camera attached to the tool center point. Results of this work were presented by depicting the achieved positions in comparison with the ideal target positions. The "chain of refined perception" will be established as an approach in further research activities focusing on the efficient planning and commissioning of flexible micro-optical assembly systems. Future work aims for a product-centric approach by establishing a formalized product description similar to the descriptions in Whitney (2004) for mechanical assemblies and by deriving the assembly execution logic automatically, leading to drastically reduced planning and commissioning efforts.

Figure 1. Reduced ontology of a self-optimizing assembly system.

Figure 2. Chain of refined perception for self-optimizing assembly of micro-optical systems indicating roughly the volume or area covered by measurements as well as the measurement accuracy achieved.

Figure 3. Model of the position and orientation of the stationary and mobile camera systems (left) and photograph of the part magazine setup including dark field illumination using a ring light and the micromanipulator (right).

Figure 4. Relationship between the image and TCP coordinate system.
Figure 5.

Figure 6. Images before and after a movement have been overlaid. The camera orientation is calculated by comparing the robot's movement with the vector described by the reference mark.

Figure 7. A reference mark describes a circle while the camera system is rotated. The pivot point identifies the z axis of the TCP.

Figure 8. Hybrid assembly setup consisting of a SCARA kinematic as macro-positioner and a micromanipulator for fine alignment.The software window shows the result of image processing (detection of reference marks and localization of optics of different types).

Figure 9. Setup for detecting and localizing randomly positioned optical components on a Gel-Pak ® vacuum release tray using dark field illumination (ring light).

Figure 10. The center image shows the enhanced contrast achieved through dark field lighting. The right diagram presents a corresponding normalized intensity profile. The left drawing explains the separation of blobs using dark field illumination on a cylindrical lens (only one direction of illumination is illustrated): only the rays hitting a specific region of the cylindrical surface will be reflected into the camera. This phenomenon occurs on both sides of the GRIN lens, so that there are two separated blobs in image processing.

Figure 11. The left model illustrates the procedure of a focus measurement. On the right side, normalized focus measurements are plotted against the z coordinate of the robot. Measurements start above the focus plane, so the measurement points were taken from right to left.

Figure 12. Image data of the stationary camera are shown. Three regions (ROI, red) are used to isolate details of interest. Identified parts (a-d) are grouped and colored by their type (GRIN, HR).

Figure 13.The identified parts of Fig. 12 have each been approached by the robot in a way that the image center of the mobile camera is overlaid with the center point of each part.

Figure 14. The plot shows a single measurement run determining the focus number calculated by the Tenengrad algorithm at a 0.2 mm step size. For statistical analyses, the z coordinate of the surface has been determined 40 times by an autofocus algorithm.

Table 1. Results of camera calibration.

Table 2. Results of focus measurements.