CATS Proceedings Printout - Graz University of Technology

The 15th International IGTE Symposium 

on Numerical Field Calculation in Electrical Engineering 

Institute for Fundamentals and Theory 

in Electrical Engineering - IGTE 

Proceedings 

Sept. 17 - 19, 2012 

Hotel Novapark, Graz, Austria 

ISBN: 978-3-85125-258-3 

Verlag der Technischen Universität Graz 

www.ub.tugraz.at/Verlag 

Graz University 

of Technology

The 15th International IGTE Symposium on Numerical Field Calculation in Electrical 

Engineering is sponsored and supported by:


- I - 15th IGTE Symposium 2012 

Table of Contents 

Multi Domain Multi Scale Problems in the High Frequency Finite Element Method (FEM) .... 1
Istvan Bardi, Kezhong Zhao, Rickard Petersson, John Silvestro, Nancy Lambert

A parallel-TLM algorithm. Modelling the Earth-ionosphere waveguide .... 7
Sergio Toledo-Redondo, Alfonso Salinas, Jesús Fornieles, Jorge Portí, Bruno Besser, Herbert I. M. Lichtenegger

A Novel Parametric Model Order Reduction Approach with Applications to Geometrically Parameterized Microwave Devices .... 13
Stefan Burgard, Ortwin Farle, Romanus Dyczij-Edlinger

Efficient Finite-Element Computation of Far-Fields of Phased Arrays by Order Reduction .... 19
Alexander Sommer, Ortwin Farle, Romanus Dyczij-Edlinger

Nanoparticle device for biomedical and optoelectronics applications .... 25
Renato Iovine, Luigi La Spada, Lucio Vegni

Validation of measurements with conjugate heat transfer models .... 31
Maximilian Schrittwieser, Oszkár Bíró, Ernst Farnleitner, Gebhard Kastner

Computing the shielding effectiveness of waveguides using FE-mesh truncation by surface operator implementation .... 37
Christian Tuerk, Werner Renhart, Christian Magele

Heat Transfer Analysis on End Windings of a Hydro Generator using a Stator-Slot-Sector Model .... 41
Stephan Klomberg, Ernst Farnleitner, Gebhard Kastner, Oszkár Bíró

Numerical Investigation of Linear Systems Obtained by Extended Element-Free Galerkin Method .... 47
Taku Itoh, Soichiro Ikuno, Atsushi Kamitani

Electromagnetic Wave Propagation Simulation in Corrugated Waveguide using Meshless Time Domain Method .... 53
Soichiro Ikuno, Yoshihisa Fujita, Taku Itoh, Susumu Nakata, Atsushi Kamitani

Optimization of Permanent Magnet Linear Actuator for Braille Screen .... 59
Ivan Yatchev, Iosko Balabozov, Krastio Hinov, Vultchan Gueorgiev, Dimitar Karastoyanov

3D Finite Element Analysis of Induction Heating System for High Frequency Welding .... 63
Ilona Iatcheva, Georgi Gigov, Georgi Kunov, Rumena Stancheva

Optimization Algorithms in the View of State Space Concepts .... 67
Markus Neumayer, Daniel Watzenig, Gerald Steiner, Bernhard Brandstätter

Quasi TEM Analysis of 2D Symmetrically Coupled Strip Lines with Finite Grounded Plane using HBEM .... 73
Saša Ilić, Mirjana Perić, Slavoljub Aleksić, Nebojsa Raicevic


Design Approach for a Line-Start Internal Permanent Magnet Synchronous Motor .... 78
Vera Elistratova, Michel Hecquet, Pascal Brochet, Darius Vizireanu, Maxime Dessoude

Speed-up of Nonlinear Electromagnetic Field Analysis using Fixed-Point Method .... 84
Norio Takahashi, Kouske Shimoyama, Daisuke Miyagi, Hiroyuki Kaimori

Software agent based domain decomposition method .... 89
Matthias Jüttner, André Buchau, Michael Rauscher, Wolfgang M. Rucker, Peter Göhner

Stochastic Jiles-Atherton model accounting for soft magnetic material variability .... 95
Rindra Ramarotafika, Abdelkader Benabou, Stéphane Clénet

Human exposure to the magnetic field produced by MFDC spot welding systems .... 101
Davide Bavastro, Aldo Canova, Luca Giaccone, Michele Manca, Marco Simioli

A Circuital Approach for Eddy Currents Fast Evaluation in Beam-like Structures .... 108
Alessandro Formisano, Raffaele Martone

Convergence Characteristics of Preconditioned MRTR Method with Eisenstat’s Technique in Real Symmetric Sparse Matrix .... 113
Yoshifumi Okamoto, Tomonori Tsuburaya, Koji Fujiwara, Shuji Sato

High Frequency Mixing Rule Based Effective Medium Theory of Metamaterials .... 119
Zsolt Szabo

Enhancement of Maximum Starting Torque and Efficiency in Permanent Magnet Synchronous Motors .... 125
Jawad Faiz, Vahid Ghorbanian, Bashir Mahdi Ebrahimi

Core Losses Estimation Techniques in Electrical Machines with Different Supplies-A Review .... 131
Jawad Faiz, Amir Masoud Takbash, Bashir Mahdi Ebrahimi

Fast Computation of Inductances and Capacitances of High Voltage Power Transformer Windings .... 137
Tomislav Župan, Željko Štih, Bojan Trkulja

Numerical and Experimental Investigations of the Structural Characteristics of Stator Core Stacks .... 144
Mathias Mair, Bernhard Weilharter, Siegfried Rainer, Katrin Ellermann, Oszkár Bíró

Proper Location of the Regulating Coil in Transformers from Short-circuit Forces Point of View .... 154
Oluş Sonmez, Bilal Düzgün, Güven Kömürgöz

Robust Design of IPM motors using Co-Evolutionary Algorithms .... 160
Min Li, Andre Ruela, Frederico Guimaraes, Jaime Ramirez, David Lowther


Free-form optimization for magnetic design .... 167
Zoran Andjelic, Salih Sadovic

Optimization for ECT treatment planning .... 171
Paolo Di Barba, Luca Giovanni Campana, Fabrizio Dughiero, Carlo Riccardo Rossi, Elisabetta Sieni

Investigation of the Electroporation Effect in a Single Cell .... 175
Jaime Ramirez, William Figueiredo, Joao Francisco Vale, Isabela Metzker, Rafael Santos, Elizabeth Silva, David Lowther

Anisotropic Model for the Numerical Computation of Magnetostriction in Steel Sheets .... 181
Manfred Kaltenbacher, Adrian Volk, Michael Ertl

Analytic Approximation Solution for the Schwarz-Christoffel Parameter Problem .... 186
Norbert Eidenberger, Bernhard G. Zagar

Additional Eddy Current Losses in Induction Machines Due to Interlaminar Short Circuits .... 190
Paul Handgruber, Andrej Stermecki, Oszkár Bíró, Georg Ofner

Evaluating the influence of manufacturing tolerances in permanent magnet synchronous machines .... 198
Isabel Coenen, Thomas Herold, Christelle Piantsop Mboo, Kay Hameyer

Eddy current analysis of a PWM controlled induction machine .... 204
Hai Van Jorks, Erion Gjonaj, Thomas Weiland

Computation of end-winding inductances of rotating electrical machinery through three-dimensional magnetostatic integral FEM formulation .... 208
Flavio Calvano, Giorgio Dal Mut, Fabrizio Ferraioli, Alessandro Formisano, Fabrizio Marignetti, Raffaele Martone, Guglielmo Rubinacci, Antonello Tamburrino, Salvatore Ventre

Magnetomechanical Coupled FE Simulations of Rotating Electrical Machines .... 214
Anouar Belahcen, Katarzyna Fonteyn, Reijo Kouhia, Paavo Rasilo, Antero Arkkio

Saturable Model of Squirrel-cage Induction Motors under Stator Inter-turn Fault .... 220
Jawad Faiz, Mansour Ojaghi, Mahdi Sabouri

Accurate Magnetostatic Simulation of Step-Lap Joints in Transformer Cores Using Anisotropic Higher Order FEM .... 226
Andreas Hauck, Michael Ertl, Joachim Schöberl, Manfred Kaltenbacher

Finite Element Based Modeling of Wound Rotor Induction Machines .... 232
Martin Mohr, Oszkár Bíró, Andrej Stermecki, Franz Diwoky

Post Insulator Optimization Based on Dynamic Population Size .... 238
Peter Kitak, Arnel Glotic, Igor Ticar

Simulation of the Absorbing Clamp Method for Optimizing the Shielding of Power Cables .... 242
Szabolcs Gyimóthy, József Pávó, Péter Kiss, Tomoaki Toratani, Ryuichi Katsumi, Gábor Varga


A Neural Network Approach to Sizing an Electrical Machine .... 248
Steven Bielby, David Lowther

Exploring and Exploiting Parallelism in the Finite Element Method on Multi-core Processors: an Overview .... 254
Hussein Moghnieh, David Lowther

Diagnosis of real cracks from the three spatial components of the eddy current testing signals .... 262
Milan Smetana, Ladislav Janousek, Mihai Rebican, Tatiana Strapacova, Anton Duca, Gabriel Preda

Adaptive Galaxy-Based Search Approach Applied to Loney’s Solenoid Benchmark Problem .... 267
Leandro dos Santos Coelho, Teodoro Cardoso Bora, Piergiorgio Alotto

Implementation of a 3D magnetic circuit model for automotive applications .... 271
Ioannis Anastasiadis, Andreas Buchinger, Tobias Werth, Lukas Bellwald, Kurt Preis

Robust Optimization of Passive RFID Antennas Loaded by Non-linear Circuits .... 276
Yuta Watanabe, Hajime Igarashi

Mixed Order Edge-based Finite Element Analysis by Means of Nonconforming Connection .... 282
Yoshifumi Okamoto, Shuji Sato

Topology Optimization Using Parallel Search Strategy for Magnetic Devices .... 288
Takumi Nagano, Shogo Yasukawa, Shinji Wakao, Yoshifumi Okamoto

Modeling of the Road Influence on the Grounding System in its Vicinity .... 294
Dragan Vuckovic, Nenad Cvetkovic, Dejan Krstic, Miodrag Stojanovic

Interaction Magnetic Force Calculation of Axial Passive Magnetic Bearing Using Magnetization Charges and Discretization Technique .... 300
Saša Ilić, Ana Vuckovic, Slavoljub Aleksić

Consideration of erroneous magnets in the electromagnetic field simulation .... 305
Peter Offermann, Isabel Coenen, Kay Hameyer

Potential of Spheroids in a Homogeneous Magnetic Field in Cartesian Coordinates .... 310
Markus Kraiger, Bernhard Schnizer

Application of Signal Processing Tools for Fault Diagnosis in Induction Motors-A Review .... 315
Jawad Faiz, Amir Masoud Takbash, Bashir Mahdi Ebrahimi, Subhasis Nandi

Experimental Calibration of Numerical Model of Thermoelastic Actuator .... 321
Lukas Voracek, Vaclav Kotlan, Bohus Ulrych


Scattering Calculations of Passive UHF-RFID Transponders .... 327
Thomas Bauernfeind, Gergely Koczka, Kurt Preis, Oszkár Bíró

Simulation of a high speed Reluctance Machine including hysteresis and eddy current losses .... 331
Bernhard Schweighofer, Hannes Wegleiter, Manes Recheis, Paul Fulmek

An Iterative Domain Decomposition Method for Solving Wave Propagation Problems .... 337
Koczka Gergely, Thomas Bauernfeind, Kurt Preis, Oszkár Bíró

On Effectiveness of Model Reduction for Computational Electromagnetism .... 340
Yuki Sato, Hajime Igarashi

Calculation of eddy-current probe signal for a volumetric defect using global series expansion .... 346
Sandor Bilicz, József Pávó, Szabolcs Gyimóthy

Bodies motion computation using eddy-current integral equation .... 352
Mihai Maricaru, Ioan R. Ciric, Horia Gavrila, George-Marian Vasilescu, Florea I. Hantila

Adaptive Inductance Computation on GPU’s .... 357
Andrea Gaetano Chiariello, Alessandro Formisano, Raffaele Martone

The reduced-basis method applied to transport equations of a lithium-ion battery .... 362
Stefan Volkwein, Andrea Wesche

Surrogate Parameter Optimization based on Space Mapping for Lithium-Ion Cell Models .... 368
Matthias Scharrer, Bettina Suhr, Daniel Watzenig

Large Scale Energy Storage with Redox Flow Batteries .... 374
Piergiorgio Alotto, Massimo Guarnieri, Federico Moro, Andrea Stella

Model Order Reduction for a Lithium-Ion Cell .... 380
Bettina Suhr, Jelena Rubesa

Automatic domain detection for a meshfree post-processing in boundary element methods .... 386
André Buchau, Matthias Jüttner, Wolfgang M. Rucker

Efficient modeling of coil filament losses in 2D .... 392
Leena Lehti, Janne Keränen, Saku Suuriniemi, Timo Tarhasaari, Lauri Kettunen

Optimization of Energy Storage Usage .... 398
Arnel Glotic, Peter Kitak, Igor Ticar, Adnan Glotic

Adaptive Surrogate Approach for Bayesian Inference in Inverse Problems .... 403
Markus Neumayer, Helcio R.B. Orlande, Marcello J. Colaco, Daniel Watzenig, Gerald Steiner, Bernhard Brandstätter, George S. Dulikravich


Multi-Domain Multi-Scale Problems in High 

Frequency Finite Element Methods 

Istvan Bardi, Kezhong Zhao, Rickard Petersson, John Silvestro and Nancy Lambert 

ANSYS Inc., 225 W Station Square Drive, Pittsburgh, PA 15219, U.S.A. 

E-mail: steve.bardi@ansys.com 

Abstract—This paper presents Domain Decomposition Methods to overcome the challenges posed by multi-domain, multi-scale 

high frequency problems. By decomposing large electromagnetic regions into smaller domains, the Finite Element Method can 

cope with the simulation of electrically large problems. A hybrid Finite Element and Boundary Integral procedure is also 

presented that allows different solution methods to be employed in different subdomains. The Robin Transmission 

Condition (RTC) is applied to link the domains and preserve field continuity on interfaces. Real life examples demonstrate the 

accuracy and efficiency of the new method. 

Index Terms—FEM, hybrid FEM and boundary element method, multi scale problems, Robin Transmission Condition 

I. INTRODUCTION 

The finite element method (FEM) is a powerful tool 

for simulating high frequency structures. There are 

several features of the method that have become expected 

elements of a successful commercial simulator. These 

elements include spurious mode free hierarchical, higher 

order vector basis functions, curvilinear elements, 

automatic/adaptive meshing, transfinite elements, mesh 

truncation methods, broad band frequency sweeps, 

parameterization and preconditioned iterative solvers. 

However, new challenges have emerged in recent years: 

the simulation tools need to cope with multi-scale 

problems that start on the chip level, couple to the 

package and board levels, and encompass the platform 

and antenna levels. Each component in a multi-scale 

problem can require millions of unknowns to simulate. 

Chip complexity rises to billions of circuit elements; 

packages involve large numbers of ports; printed circuit 

boards (PCBs) often contain thousands of traces on many 

layers; and platforms and antennas often involve 

dimensions of hundreds of wavelengths. 

While recent advances in High Performance 

Computing (HPC) hardware greatly accelerate numerical 

computations, new algorithms are needed to exploit the 

new HPC environment. In particular, efficient and 

effective physics-based parallelization is required to 

address the challenges of multi-scale and multi-domain 

simulation. This paper presents an overview of domain 

decomposition methods that exploit the physical nature 

of multi-scale, multi-domain problems to tackle 

heretofore impossibly complex high frequency problems. 

II. BASICS OF DOMAIN DECOMPOSITION METHOD 

A basic characteristic of HPC is the use of multiple 

processors to perform computations in parallel. An 

algebraic approach to using HPC is to partition large 

matrices into smaller sub-matrices. In some cases, this is 

inefficient even when iterative solvers are employed. 

Physics-based domain decomposition is typically more 

efficient because the subdomains exploit the geometry 

and the field. In this case, both the solution domain and 

the mesh are partitioned into smaller subdomains and 

meshes. The mesh of the sub-domains can be 

overlapping, conformal touching, non-conformal 

touching or even non-touching [1-3]. 

Another important advantage of the domain decomposition method (DDM) is the ability to 

link differing solution methods and physics. For instance, 

the finite element method is better at simulating complex 

geometries, while the boundary element method copes 

better with electrically large but simple, smooth 

structures. This paper presents hybrid finite element-boundary integral (FE-BI) methods that allow non-conformal touching domains and disjoint regions. 

Figure 1: Incident surface electric and magnetic current 

densities impinging on an FEM domain 

A. Single FE domain with surface electric and 

magnetic current excitations 

Consider a computational domain where an incident 

field impinges on a section of a boundary as illustrated in 

Figure 1. The wave equation to be solved is 

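$$\nabla \times \mu_{r1}^{-1} \nabla \times \mathbf{E}_1 - k_o^2 \varepsilon_{r1} \mathbf{E}_1 = -j k_o \eta \mathbf{J}_1^{imp} \tag{1}$$

where $\mathbf{J}_1^{imp}$ is the impressed current density. The total field description is used inside the domain, while the scattered field description is used outside

$$\mathbf{E}_1^{sc} = \mathbf{E}_1 - \mathbf{E}_1^{inc}; \quad \mathbf{H}_1^{sc} = \mathbf{H}_1 - \mathbf{H}_1^{inc} \tag{2}$$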

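For the sake of simplicity, a Dirichlet boundary condition is used on $\partial\Omega_1 \setminus \Gamma_{12}$

$$\mathbf{n}_1 \times \mathbf{E}_1 = 0 \tag{3}$$

On $\Gamma_{12}$, both the electric and the magnetic field jump with $\mathbf{n}_1 \times \mathbf{E}_1^{inc}$ and $\mathbf{n}_2 \times \mathbf{H}_2^{inc}$, respectively [4]. 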

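Consequently, an absorbing boundary condition is required for $\mathbf{E}^{sc}$ and $\mathbf{H}^{sc}$. Assume we know an operator that provides perfect absorption: 

$$\mathbf{n}_1 \times \mathbf{H}_1^{sc} = ABC(\mathbf{n}_1 \times \mathbf{n}_1 \times \mathbf{E}_1^{sc}) \tag{4}$$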

Since the total field description is used in the computational domain, the scattered field variables can be eliminated using (2). Then, the Neumann boundary condition for the total magnetic field is 

$$\mathbf{n}_1 \times \mu_{r1}^{-1} \nabla \times \mathbf{E}_1 = -j\omega\mu_o \left( ABC(\mathbf{e}_1) - ABC(\mathbf{e}_1^{inc}) + \mathbf{J}_1^{inc}/\eta \right) \tag{5}$$

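where the electric surface current density $\mathbf{J}$ and the magnetic surface current density $\mathbf{e}$ are introduced as 

$$\mathbf{J} = \eta\, \mathbf{n} \times \mathbf{H} \quad \text{and} \quad \mathbf{e} = \mathbf{n} \times \mathbf{n} \times \mathbf{E} \tag{6}$$

where $\eta$ is the wave impedance. Introducing the first order ABC generates the simple form 

$$\mathbf{n}_1 \times \mu_{r1}^{-1} \nabla \times \mathbf{E}_1 = -j k_o \left( \mathbf{e}_1 - \mathbf{e}_1^{inc} + \mathbf{J}_1^{inc} \right) \tag{7}$$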

Equations (5) and (7) are Robin transmission boundary 

conditions. They generalize the Neumann boundary 

condition to include the incident fields. Thus, to excite 

the computational domain by an external incident field, 

the electric and magnetic surface current densities $\mathbf{J}^{inc}$ and $\mathbf{e}^{inc}$ need to be specified. The transmission conditions are first order when the first order ABC is used and higher order when higher order ABCs are used. 
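
As a worked step, the reduction of Eq. (5) to Eq. (7) can be sketched as follows. This assumes that the first order ABC acts as a simple scaling, $ABC(\mathbf{x}) = \mathbf{x}/\eta$, and that $\eta$ is the free space wave impedance so that $\omega\mu_o = k_o\eta$; neither relation is written out explicitly in the text, so both are labelled here as assumptions.

```latex
\begin{aligned}
\mathbf{n}_1 \times \mu_{r1}^{-1} \nabla \times \mathbf{E}_1
  &= -j\omega\mu_o \left( \frac{\mathbf{e}_1}{\eta} - \frac{\mathbf{e}_1^{inc}}{\eta} + \frac{\mathbf{J}_1^{inc}}{\eta} \right)
  && \text{assumed: } ABC(\mathbf{x}) = \mathbf{x}/\eta \\
  &= -j k_o \left( \mathbf{e}_1 - \mathbf{e}_1^{inc} + \mathbf{J}_1^{inc} \right)
  && \text{assumed: } \omega\mu_o/\eta = k_o
\end{aligned}
```

Under these assumptions the operator $ABC(\cdot)$ disappears entirely, which is why the first order condition couples the surface quantities only algebraically.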

B. Multiple FE domains with surface electric and 

magnetic current coupling 

Now consider a computational domain that is 

subdivided into two subdomains (Fig. 2). 

Figure 2: Decomposition into two non-overlapping 

subdomains 

The boundary value problem (BVP) for the first domain 

is 

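$$\nabla \times \mu_{r1}^{-1} \nabla \times \mathbf{E}_1 - k_o^2 \varepsilon_{r1} \mathbf{E}_1 = -j k_o \eta \mathbf{J}_1^{imp} \quad \text{in } \Omega_1 \tag{8}$$

$$\mathbf{n}_1 \times \mu_{r1}^{-1} \nabla \times \mathbf{E}_1 = -j k_o \mathbf{J}_1 = -j k_o \left( \mathbf{e}_1 - \mathbf{e}_1^{inc} + \mathbf{J}_1^{inc} \right) \quad \text{on } \Gamma_{12} \tag{9}$$

and similarly for the second domain 

$$\nabla \times \mu_{r2}^{-1} \nabla \times \mathbf{E}_2 - k_o^2 \varepsilon_{r2} \mathbf{E}_2 = -j k_o \eta \mathbf{J}_2^{imp} \quad \text{in } \Omega_2 \tag{10}$$

$$\mathbf{n}_2 \times \mu_{r2}^{-1} \nabla \times \mathbf{E}_2 = -j k_o \mathbf{J}_2 = -j k_o \left( \mathbf{e}_2 - \mathbf{e}_2^{inc} + \mathbf{J}_2^{inc} \right) \quad \text{on } \Gamma_{12} \tag{11}$$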

The incident field for the first domain is the field of the 

second domain and vice versa 

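$$\mathbf{e}_1^{inc} = \mathbf{e}_2; \quad \mathbf{e}_2^{inc} = \mathbf{e}_1 \tag{12}$$

$$\mathbf{J}_1^{inc} = -\mathbf{J}_2; \quad \mathbf{J}_2^{inc} = -\mathbf{J}_1 \tag{13}$$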


Applying this to Equations (9) and (11), we get 

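$$\mathbf{n}_1 \times \mu_{r1}^{-1} \nabla \times \mathbf{E}_1 = -j k_o \mathbf{J}_1 = -j k_o \left( \mathbf{e}_1 - \mathbf{e}_2 - \mathbf{J}_2 \right) \tag{14}$$

$$\mathbf{n}_2 \times \mu_{r2}^{-1} \nabla \times \mathbf{E}_2 = -j k_o \mathbf{J}_2 = -j k_o \left( \mathbf{e}_2 - \mathbf{e}_1 - \mathbf{J}_1 \right) \tag{15}$$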

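The right-hand sides of Eqs. (14) and (15) are the Neumann boundary conditions for domains $\Omega_1$ and $\Omega_2$, respectively. They will be included in the finite element formulation as natural boundary conditions. Since $\mathbf{J}_1$ and $\mathbf{J}_2$ were introduced, the relations in Eqs. (14), (15) must be prescribed explicitly as well 

$$\mathbf{J}_1 = \mathbf{e}_1 - \mathbf{e}_2 - \mathbf{J}_2 \tag{16}$$

$$\mathbf{J}_2 = \mathbf{e}_2 - \mathbf{e}_1 - \mathbf{J}_1 \tag{17}$$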
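
A short consistency check may help here. Assuming the sign convention in which Eqs. (16) and (17) read $\mathbf{J}_1 = \mathbf{e}_1 - \mathbf{e}_2 - \mathbf{J}_2$ and $\mathbf{J}_2 = \mathbf{e}_2 - \mathbf{e}_1 - \mathbf{J}_1$ (an assumption about the signs, with the two normals pointing in opposite directions), the two Robin conditions together are equivalent to continuity of the tangential fields across the interface:

```latex
\begin{aligned}
\text{(16)} + \text{(17)}:\quad
  & \mathbf{J}_1 + \mathbf{J}_2 = -(\mathbf{J}_1 + \mathbf{J}_2)
  \;\Rightarrow\; \mathbf{J}_1 = -\mathbf{J}_2 , \\
\text{insert into (16)}:\quad
  & \mathbf{J}_1 = \mathbf{e}_1 - \mathbf{e}_2 + \mathbf{J}_1
  \;\Rightarrow\; \mathbf{e}_1 = \mathbf{e}_2 .
\end{aligned}
```

With $\mathbf{n}_1 = -\mathbf{n}_2$, the relation $\mathbf{J}_1 = -\mathbf{J}_2$ states that the tangential magnetic field is continuous, and $\mathbf{e}_1 = \mathbf{e}_2$ states the same for the tangential electric field.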

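It can be proved [1] that the solutions of the differential equations (8) and (10) are unique when the natural interface conditions (14), (15) and the essential interface conditions (16), (17) are applied. Applying Galerkin’s method, the bilinear form and the essential boundary condition for domain $\Omega_1$ are the following: 

$$b(\mathbf{v}_1, \mathbf{E}_1) - j k_o \langle \mathbf{v}_1, \mathbf{e}_1 \rangle_{\Gamma_{12}} + j k_o \langle \mathbf{v}_1, \mathbf{e}_2 \rangle_{\Gamma_{12}} + j k_o \langle \mathbf{v}_1, \mathbf{J}_2 \rangle_{\Gamma_{12}} = -j k_o \eta \langle \mathbf{v}_1, \mathbf{J}_1^{imp} \rangle_{\Omega_1} \tag{18}$$

$$j k_o \left( \langle \mathbf{w}_1, \mathbf{e}_1 \rangle_{\Gamma_{12}} - \langle \mathbf{w}_1, \mathbf{J}_1 \rangle_{\Gamma_{12}} - \langle \mathbf{w}_1, \mathbf{e}_2 \rangle_{\Gamma_{12}} - \langle \mathbf{w}_1, \mathbf{J}_2 \rangle_{\Gamma_{12}} \right) = 0 \tag{19}$$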

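The same applies to $\Omega_2$. Note that the testing functions w should be orthogonal to those of v. Discretizing the scalar products yields the matrix equation [1] 

$$\begin{bmatrix} K_1 & G_{12} \\ G_{21} & K_2 \end{bmatrix} \begin{bmatrix} u_1 \\ u_2 \end{bmatrix} = \begin{bmatrix} y_1 \\ y_2 \end{bmatrix} \tag{20}$$

where 

$$u_k = \begin{bmatrix} E_k \\ e_k \\ J_k \end{bmatrix}; \quad y_k = \begin{bmatrix} b_k \\ 0 \\ 0 \end{bmatrix}; \quad k = 1, 2 \tag{21}$$

$$K_k = \begin{bmatrix} A_k & C_k & 0 \\ C_k^T & B_k + j k_o T_{kk}^{vv} & 0 \\ 0 & j k_o T_{kk}^{wv} & j k_o T_{kk}^{ww} \end{bmatrix} \tag{22}$$

$$G_{12} = G_{21} = \begin{bmatrix} 0 & 0 & 0 \\ 0 & j k_o T_{12}^{vv} & j k_o T_{12}^{vw} \\ 0 & j k_o T_{12}^{wv} & j k_o T_{12}^{ww} \end{bmatrix} \tag{23}$$

$$T_{ij}^{vv} = \langle \mathbf{n} \times \mathbf{v}_i, \mathbf{n} \times \mathbf{v}_j \rangle_{\Gamma_{12}}; \quad T_{ij}^{ww} = \langle \mathbf{n} \times \mathbf{w}_i, \mathbf{n} \times \mathbf{w}_j \rangle_{\Gamma_{12}} \tag{24}$$

$$T_{ij}^{vw} = \langle \mathbf{n} \times \mathbf{v}_i, \mathbf{n} \times \mathbf{w}_j \rangle_{\Gamma_{12}} \tag{25}$$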

Matrices A, B, C and b are the same as in the case of 

standard FE discretization and can be found in [1] along 

with the definitions of the scalar products. The structure 

of Eq. (18) shows that the variables of the FE domains 

are coupled just via the surface electric and magnetic 

current variables, which are called cement variables. 

C. Hybrid FE - BI domains with surface electric and 

magnetic current coupling 

Fig. 3 shows two separated domains $\Omega_1$ and $\Omega_2$. The fields in these domains are coupled via the free space domain $\Omega_{ext}$. The FEM is used in $\Omega_1$ and $\Omega_2$, while the Boundary Integral (BI) method is used in $\Omega_{ext}$.

Figure 3: Decomposition into two FEM and one BI 

subdomains 

The boundary value problem for the finite element 

domains is similar to that in section B 

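$$\nabla \times \mu_{ri}^{-1} \nabla \times \mathbf{E}_i - k_o^2 \varepsilon_{ri} \mathbf{E}_i = -j k_o \eta \mathbf{J}_i^{imp} \quad \text{in } \Omega_i \tag{26}$$

$$\mathbf{n}_i \times \mu_{ri}^{-1} \nabla \times \mathbf{E}_i = -j k_o \mathbf{J}_i^{-} = -j k_o \left( \mathbf{e}_i^{-} - \mathbf{e}_i^{+} - \mathbf{J}_i^{+} \right) \quad \text{on } \Gamma_i^{-} \tag{27}$$

$$\mathbf{J}_i^{-} = \mathbf{e}_i^{-} - \mathbf{e}_i^{+} - \mathbf{J}_i^{+} \quad \text{on } \Gamma_i^{-} \tag{28}$$

Eqs. (27) and (28) are the Neumann and the Robin transmission boundary conditions, respectively. 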

For the unbounded subdomain $\Omega_{ext}$, the boundary integral equation representation is used, based on the Stratton-Chu formulation [2]. The boundary integral equations for the electric and the magnetic current densities are 

1 

2 

inc 

 

ei ei 

{ nk 

( C( 

nk 

ek 

)) jk onk 

( A( 

Jk 

)) 

2 

k1 

1 

 

( jk o) 

 

( 

Jk 

)} on i (29) 

jk 

2 

o inc 

 

J i Ji 

{ jkonk 

nk 

( C( 

Jk 

)) 

2 

k1 

2 

 

 

jkonk nk ( 

A( 

nk 

ek 

)) 

( 

nk 

ek 

)} on i (30) 

and the Robin transmission boundary condition is: 

$$\mathbf{J}_i^{+} = \mathbf{e}_i^{+} - \mathbf{e}_i^{-} - \mathbf{J}_i^{-} \quad \text{on } \Gamma_i^{+} \tag{31}$$

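where 

$$A(\mathbf{x}) = \int_{\Gamma} \mathbf{x}\, g\, ds'; \quad \Psi(\mathbf{x}) = \int_{\Gamma} (\nabla' \cdot \mathbf{x})\, g\, ds'; \quad C(\mathbf{x}) = \int_{\Gamma} \mathbf{x} \times \nabla' g\, ds' \tag{32}$$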

Applying Galerkin’s method again, the matrix equation to 

be solved is 

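$$\begin{bmatrix} K_1 & N_{12} \\ N_{12}^T & K_2 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} y_1 \\ y_2 \end{bmatrix} \tag{33}$$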

where 

AII 

A 0 0 

0 

I 

 

 

 

 

A A T D T D 

I 

 

 

T 

T 

K 0 D T D T 

i 

 

 

 

 

 

T 

T 

 

0 T D Q T P 

 

ii 

ii 

 

T 

T 

T 

 

0 D T P Q T 

 

 

ii 

ii 

I 

0 

0 0 0 0 Ei 

y i 

 

 

0 0 0 0 0 e 

 

i 0 

; 

 

N12 

0 

0 0 0 0 x i J 

; 

i y (34) 

i 0 

 

inc 

E 

0 

0 0 Q12 

P12 

ei 

y 

i 

T 

 

inc 

 

0 

0 0 P 

12 Q12 

 

J 

i 

H 

y 

i 

Further details are provided in [1]. 

This general hybrid finite element-boundary integral 

equation method (hybrid FE-BI) is very flexible. The 

subdomains can be FEM, BEM or any other numerical 

method. If just one FEM subdomain exists, it provides a 

perfect absorbing boundary condition (FE-BI). 


III. SOLVING THE MATRIX EQUATIONS 

In this section, the solution of the matrix equations is 

presented via a stationary alternating Schwarz algorithm 

based on Jacobi Splitting. The idea is to eliminate the 

internal variables and solve for the surface current 

densities also called cement variables. Performing this 

process iteratively is called domain iteration. Partitioning 

the variables accordingly 

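$$c_k = \begin{bmatrix} e_k \\ J_k \end{bmatrix}; \quad u_k = \begin{bmatrix} E_k \\ c_k \end{bmatrix} \tag{35}$$

Eq. (20) for the k-th domain is 

$$K_k \begin{bmatrix} E_k \\ c_k \end{bmatrix} = \begin{bmatrix} b_k \\ 0 \end{bmatrix} - \begin{bmatrix} 0 \\ g_{kj} \end{bmatrix} c_j \tag{36}$$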

$g_{kj}$ can be read from Eq. (23), and k and j are the domain indices. Note that the internal variables are expressed by the cement variables. Supposing the inverse matrix of the k-th domain is known and also partitioned into internal and cement variable blocks, we get: 

$$E_k = P_k^{EE} b_k - P_k^{Ec} g_{kj} c_j, \qquad c_k = P_k^{cE} b_k - P_k^{cc} g_{kj} c_j \tag{37}$$

where 

$$P_k = \begin{bmatrix} P_k^{EE} & P_k^{Ec} \\ P_k^{cE} & P_k^{cc} \end{bmatrix} = K_k^{-1} \tag{38}$$

These equations allow domain iterations to be applied 

$$E_k^{n+1} = P_k^{EE} b_k - P_k^{Ec} g_{kj} c_j^{n} \tag{39}$$

$$c_k^{n+1} = P_k^{cE} b_k - P_k^{cc} g_{kj} c_j^{n} \tag{40}$$

where the superscript n provides the iteration number. The matrix $P_k^{cc}$ is called the numerical Green’s function and quantities $c_k$ and $c_j$ in Eq. (40) are called the cement variables. The domain iteration works with the cement variables only, but it needs blocks of the inverse of the system matrix of the internal variables. Eq. (39) provides the update for the internal variables, which are not included in the domain iteration because they do not need to be updated unless the right-hand side changes. 
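
The domain iteration on the cement variables can be illustrated with a small numerical sketch. It is not the authors' implementation: the subdomain matrices are random, real-valued and diagonally dominated stand-ins (the FEM matrices are complex), the coupling blocks `g[k]` are hypothetical, and the names `cement_block` and `domain_iteration` are introduced only for this example. The sign convention follows the form $c_k^{n+1} = P_k^{cE} b_k - P_k^{cc} g_{kj} c_j^{n}$.

```python
import numpy as np


def cement_block(g, n_int):
    """Embed g_kj in the cement-cement corner of a full coupling block G_kj."""
    n = n_int + g.shape[0]
    G = np.zeros((n, n))
    G[n_int:, n_int:] = g
    return G


def domain_iteration(K, b, g, n_int, tol=1e-12, max_iter=200):
    """Jacobi-type domain iteration for two subdomains.

    K[k] : subdomain matrix with unknowns ordered as [E_k; c_k]
    b[k] : right-hand side of the internal unknowns E_k
    g[k] : coupling block acting on the cement variables of the other domain
    """
    P = [np.linalg.inv(Kk) for Kk in K]            # P_k = K_k^{-1}, computed once
    P_EE = [Pk[:n_int, :n_int] for Pk in P]
    P_Ec = [Pk[:n_int, n_int:] for Pk in P]
    P_cE = [Pk[n_int:, :n_int] for Pk in P]
    P_cc = [Pk[n_int:, n_int:] for Pk in P]

    c = [np.zeros(K[k].shape[0] - n_int) for k in (0, 1)]
    for it in range(1, max_iter + 1):
        # Cement update: both domains use the previous iterate of the neighbour.
        c_new = [P_cE[k] @ b[k] - P_cc[k] @ g[k] @ c[1 - k] for k in (0, 1)]
        change = max(np.linalg.norm(c_new[k] - c[k]) for k in (0, 1))
        c = c_new
        if change < tol:
            break
    # Internal unknowns are recovered once, after the cement variables converged.
    E = [P_EE[k] @ b[k] - P_Ec[k] @ g[k] @ c[1 - k] for k in (0, 1)]
    return E, c, it


rng = np.random.default_rng(0)
n_int, n_cem = 6, 3
n = n_int + n_cem
K = [rng.standard_normal((n, n)) + 2 * n * np.eye(n) for _ in range(2)]
b = [rng.standard_normal(n_int) for _ in range(2)]
g = [0.5 * rng.standard_normal((n_cem, n_cem)) for _ in range(2)]

E, c, n_iter = domain_iteration(K, b, g, n_int)

# Reference: direct solve of the coupled two-domain block system.
full = np.block([[K[0], cement_block(g[0], n_int)],
                 [cement_block(g[1], n_int), K[1]]])
rhs = np.concatenate([b[0], np.zeros(n_cem), b[1], np.zeros(n_cem)])
ref = np.linalg.solve(full, rhs)
print("iterations:", n_iter)
print("max deviation from direct solve:",
      np.max(np.abs(np.concatenate([E[0], c[0], E[1], c[1]]) - ref)))
```

The loop touches only the small cement vectors; the internal unknowns are evaluated a single time at the end, which mirrors the remark that they need not be updated during the iteration.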

Other popular methods can also be applied, such as GMRES, a Krylov subspace method. The domain iteration needs to invert the subdomain matrices in each iteration. For this purpose, either a multifrontal direct solver or an iterative solver with a p-type multiplicative Schwarz (pMUS) preconditioner can be used. The iteration matrix 

$$A_{kj} = P_k^{cc} g_{kj} \tag{41}$$

is dense, but it can be replaced by a data-sparse low-rank approximation using Adaptive Cross Approximation (ACA) 

$$A_{kj}^{m \times n} \approx \tilde{A}_{kj}^{m \times n} = U_{kj}^{m \times r} V_{kj}^{r \times n} \tag{42}$$

where m and n are the row and column numbers and r is the rank of the approximation. 
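
The low-rank factorization $A \approx U V$ can be sketched with a cross approximation that uses full pivoting. This is a simplified stand-in: practical ACA codes use partial pivoting so that the dense matrix never has to be formed, whereas this sketch works on an explicit real matrix for clarity. The function name `aca_full_pivot` and the test kernel (interaction between two well-separated point clusters) are assumptions made for illustration only.

```python
import numpy as np


def aca_full_pivot(A, tol=1e-8, max_rank=None):
    """Fully pivoted cross approximation: A (m x n) ~= U (m x r) @ V (r x n)."""
    R = np.array(A, dtype=float)           # residual matrix, updated in place
    m, n = R.shape
    max_rank = min(m, n) if max_rank is None else max_rank
    norm_A = np.linalg.norm(R)
    us, vs = [], []
    for _ in range(max_rank):
        if np.linalg.norm(R) <= tol * norm_A:
            break                           # requested relative accuracy reached
        i, j = np.unravel_index(np.argmax(np.abs(R)), R.shape)
        pivot = R[i, j]
        u = R[:, j] / pivot                 # scaled pivot column
        v = R[i, :].copy()                  # pivot row
        R -= np.outer(u, v)                 # remove the rank-1 cross
        us.append(u)
        vs.append(v)
    U = np.column_stack(us) if us else np.zeros((m, 0))
    V = np.vstack(vs) if vs else np.zeros((0, n))
    return U, V


# Hypothetical test matrix: 1/distance kernel between two separated clusters.
x = np.linspace(0.0, 1.0, 60)
y = np.linspace(10.0, 11.0, 50)
A = 1.0 / np.abs(x[:, None] - y[None, :])

U, V = aca_full_pivot(A, tol=1e-8)
rank = U.shape[1]
rel_err = np.linalg.norm(A - U @ V) / np.linalg.norm(A)
print("rank:", rank, "of", min(A.shape))
print("relative error:", rel_err)
print("stored entries:", U.size + V.size, "instead of", A.size)
```

Storing the factors costs r(m + n) entries instead of mn, which is where the memory saving comes from when r is much smaller than m and n.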

The domain iteration method presented above was 

derived for the case when the solution domain is 

partitioned into two sub-domains with one coupling 

surface interface. If the solution domain is partitioned 

into multiple domains with multiple coupling surfaces, 

the number of the cement variables increases but the

essence remains the same: the subdomain variables are 

expressed in terms of cement variables and the domain 

iteration is set up for the cement variables. The same 

applies when the subdomains are coupled via BI domains. 

The convergence of the domain iteration depends on 

the order of the Robin transmission boundary conditions. 

For simplicity, first order Robin boundary conditions 

were used in the above derivations. Higher order 

conditions are also available in [5]. Higher order transmission conditions enforce the requirement that the eigenvalues of the iteration matrix lie inside the unit circle. This is a necessary condition for good convergence of the domain iteration. A second order transmission boundary condition can be realized analogously to Eqs. (16) and (17) 

J j Aje j Bj 

 

e 

j Jk 

Ake 

k Bk 

 

ek 

(43) 

J A e B 

e 

J 

A e B 

e 

k k k k k j j j j 

(44) 

where k and j are the indices of the neighboring domains 

and denotes the surface operator. Constants $A_k$, $B_k$, $A_j$ 

and $B_j$ can be optimized for convergence. Figure 4 

shows the improvement in convergence provided by the 

second order Robin Transmission Condition (RTC). 

Figure 4: Convergence with first and second order RTC 

III. REPETITIVE STRUCTURES 

If identical substructures exist in the computational 

domain, the computational effort of storing and solving 

the equations is dramatically reduced. Repeated 

identical substructures are called unit cells and have 

the same mesh. Only one unit cell is stored physically 

in the computer; the other unit cells are virtual. The 

physically-stored unit cell is called the parent, while 

the virtual ones are called children. A structure can 

have multiple parents. In the case of non-conformal 

domain decomposition, no constraints are applied to 

the mesh. In the case of conformal DDM, the parent 

mesh must be constrained so that it matches with the 

surface meshes of the children. 

For the sake of simplicity, assume that the entire 

computational domain consists of just one repeated 

structure. This is usually the case when finite antenna 

arrays are simulated. Figure 5 shows a single parent 

case with an internal block and matrices of repetitive 

unit cells. Here there is one system matrix and two 

coupling matrices. Thus, only three of the sixteen matrix blocks need to be stored, and matrix block A must be factored only once instead of four times. 
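
The saving for repeated unit cells can be sketched as follows. The example assumes a chain of four identical cells, so the 4 x 4 block system contains one unit-cell block `A` and two coupling blocks (`G_next`, `G_prev`); the chain topology, the random real-valued matrices and the function name `solve_repeated_cells` are assumptions for illustration, and `np.linalg.inv` stands in for a factorization that a production code would keep as LU factors.

```python
import numpy as np


def solve_repeated_cells(A, G_next, G_prev, rhs, n_cells, tol=1e-12, max_iter=500):
    """Block-Jacobi solve for a chain of identical unit cells.

    Only three blocks are stored: the unit-cell matrix A and the couplings
    to the next (G_next) and the previous (G_prev) cell.
    """
    A_inv = np.linalg.inv(A)                # "factor" the parent cell once
    x = [np.zeros(A.shape[0]) for _ in range(n_cells)]
    for it in range(1, max_iter + 1):
        x_new = []
        for k in range(n_cells):
            r = rhs[k].copy()
            if k + 1 < n_cells:
                r -= G_next @ x[k + 1]
            if k > 0:
                r -= G_prev @ x[k - 1]
            x_new.append(A_inv @ r)         # the same factor serves every cell
        change = max(np.linalg.norm(a - b) for a, b in zip(x_new, x))
        x = x_new
        if change < tol:
            break
    return np.concatenate(x), it


rng = np.random.default_rng(1)
n, n_cells = 5, 4
A = rng.standard_normal((n, n)) + 4 * n * np.eye(n)
G_next = 0.3 * rng.standard_normal((n, n))
G_prev = 0.3 * rng.standard_normal((n, n))
rhs = [rng.standard_normal(n) for _ in range(n_cells)]

x_dd, n_iter = solve_repeated_cells(A, G_next, G_prev, rhs, n_cells)

# Reference: assemble all sixteen blocks explicitly and solve directly.
full = np.zeros((n * n_cells, n * n_cells))
for k in range(n_cells):
    rows = slice(k * n, (k + 1) * n)
    full[rows, rows] = A
    if k + 1 < n_cells:
        full[rows, (k + 1) * n:(k + 2) * n] = G_next
    if k > 0:
        full[rows, (k - 1) * n:k * n] = G_prev
x_ref = np.linalg.solve(full, np.concatenate(rhs))
print("iterations:", n_iter)
print("max deviation from direct solve:", np.max(np.abs(x_dd - x_ref)))
```

The explicit 16-block matrix is built here only to check the result; the iterative routine itself never stores more than the three distinct blocks.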

Figure 5: Internal blocks and corresponding matrices 

of repetitive unit cells 

IV. MULTI DOMAIN DDM WITH FE-BI 

As has been shown, DDM is based on a divide-and-conquer philosophy. Instead of tackling a large and 

complex problem directly as a whole, the original 

problem is partitioned into smaller, possibly repetitive, 

and easier to solve sub-domains. In this paper, DDM is 

used as an effective FEM preconditioner, where a higher order Robin transmission condition (RTC) is devised to enforce the continuity of electromagnetic fields between adjacent sub-domains and to accelerate the convergence of the iterative process. DDM is also employed to provide a 

hybrid FEM-BEM approach where the treatment of the 

radiation condition is exact. The hybrid finite element-boundary integral (FE-BI) method allows FEM domains 

to be disconnected with the coupling between disjoint 

domains provided via Green’s functions. The advantages 

of DDM-based FE-BI compared to traditional FE-BI 

include modularity of FEM and BI domains in terms of 

mesh and basis functions. This “non-conformal” ability 

significantly simplifies the integration of existing state-of-the-art FEM and BEM solvers. Enforcing continuity through the RTC naturally renders the present FE-BI method free of internal resonance issues. Since 

domains are allowed to be disjoint, if one or more subdomains 

are purely metallic or highly conducting, DDM 

can allow the integral equation method to be applied to 

these sub-domains directly to reduce memory 

consumption. 

V. APPLICATIONS 

To illustrate the effectiveness and accuracy of DDM, 

an array of tapered slot antennas is considered. The 

antenna element is of the Vivaldi type. The antenna is 

similar to the one described in [8]. The rectangular array 

spacing is 34 mm along y and 36 mm along x. The εr = 6 substrate is 0.02 λ0 thick, and the height and opening size of the slot are ≈0.5 and 0.45 λ0, respectively. To show the 

accuracy of the simulation an array of 81 elements (9x9) 

was analyzed using DDM. For comparison, a full array 

model of the 81 elements with a slightly different edge 

treatment was also created and simulated using FEM in a 

single domain. The model simulated without using DDM 

will be referred to as the explicit model. The two patterns 

for the φ=0° cut (perpendicular to the slot faces) for all 

elements excited with equal amplitude and 0° phase shift 

are shown in Figure 6. Excellent agreement is obtained. 

In addition to being able to compute the field patterns, the 

full scattering matrix can also be extracted from the DDM 

simulation. To verify the accuracy of this computation, 

consider the data shown in Figure 7. This plot compares

the reflection coefficient of the center element (element 

#41) in the array and also the coupling terms (S41,-- dB) 

for the coupling between the center element and the next 

4 elements along the same row of slots. Again, the agreement 

between the two sets of data is excellent. 

Another infinite array simulation was performed using 

linked boundary conditions and the active element pattern 

was computed [9]. The active element pattern is the 

radiation pattern for an infinite array of elements where 

only a single element is excited. Finite arrays of 9x9 and 

21x21 elements were simulated using DDM. The 

radiation pattern with only the center element excited was 

computed for each of these arrays. For comparison a 

single antenna element on a finite ground plane was also 

analyzed. The normalized φ=90° patterns for these 4 

antennas are shown in Figure 8. The agreement with the 

infinite array active element pattern improves as the array 

size increases from 1 to 9x9 to 21x21 elements. This 

demonstrates the accuracy of the DDM simulation 

procedure for large arrays. As a final test, a 15x15 array 

of Vivaldi elements was simulated using DDM. In this 

case, the radiation pattern for 0° scan angle was 

calculated. The 3D polar plot of this pattern is shown in 

Figure 9. 

All simulations were run on a Linux cluster. Each 

machine in the cluster had 12 CPUs and 96 GB memory. 

The explicit 81 element model was run on a single 

machine. For the DDM simulation, the domain 

simulations were distributed over several machines and 

CPUs. For the 9x9 array, 62 domains were used and 21 GB of RAM was required; for the 15x15 array, 68 domains were used and the total memory required was ≈28 GB. The latter simulation shows the power of this approach: even though the number of tetrahedra increased significantly (from 1.6 million to 4.3 million, see Table I), the memory usage was still less than 30 GB. 

Figure 6: Comparison of the φ=0° patterns for all 

elements excited with equal phase and magnitude for 9x9 

array. The DDM data is the solid black line and the 

explicit model data is the dashed red line. 

Table I shows a comparison of solver statistics for different array sizes and methods.

TABLE I
COMPARISON OF MEMORY AND SOLUTION TIME OF DIFFERENT METHODS/ARRAYS

Method (array)     Time      Number of tets   Memory
Explicit (9 x 9)   190 min   1.7 million      50 GB
DDM (9 x 9)        90 min    1.6 million      21 GB
DDM (15 x 15)      300 min   4.3 million      28 GB


Figure 7: S41,-- where element 41 is the center element of the array and elements 42-45 are the remaining elements along the middle row, computed using two different approaches.

Figure 8: φ=90° element patterns calculated using the infinite array approximation (Element_pattern) and from a 9x9 and a 21x21 element array, compared to the pattern for a single isolated element (iso).

Figure 9: 3D polar plot of the radiation pattern for the 15x15 array where all elements are excited with equal amplitude and phase.

The next example demonstrates the efficiency of FE-BI. Figure 10 shows an Apache helicopter enclosed by a conformal FE-BI surface, simulated at 900 MHz. Table II compares the result with pure FEM and IE methods. FE-BI requires the least memory and the shortest solution time, owing to its conformal mesh truncation capability.

Figure 10: Apache helicopter with conformal FE-BI 

boundary 

TABLE II
COMPARISON OF MEMORY AND SOLUTION TIME OF DIFFERENT METHODS

Method              Number of cores   Memory   Time
FEM (PML box)       12                300 GB   330 min
IE                  12                83 GB    328 min
FE-BI (conformal)   12                21 GB    63 min

VI. CONCLUSION 

The proliferation of High Performance Computing 

(HPC) has made parallelization a basic requirement for 

simulation codes today. Computational tasks can be 

distributed on different machines (nodes) or cores 

(distributed or shared memory). DDM is an ideal 

procedure for achieving high HPC efficiency. The 

subdomain solutions are fully independent of each other, 

so they can be evaluated in parallel, either by using 

distributed or shared memory. Subdomain solvers also 

exploit multi-processing and iterative solution methods. 

Both the Schwarz and Krylov domain iteration methods 

distribute tasks with high parallelism. The standard 

Message Passing Interface (MPI) can be used to control 

the data exchange between the nodes and cores. As 

demonstrated in the examples, the hybridized FE and BI 

DDM procedure provides a flexible and efficient tool to 

solve multi-scale, multi-domain problems. 


REFERENCES 

[1] K. Zhao, V. Rawat, S. Lee and J.F Lee, "A Domain Decomposition 

Method with Nonconformal Meshes for Finite Periodic and 

Semi-Periodic Structures," IEEE Transactions on Antennas and 

Propagation, vol. 55, pp. 2559 - 2570, September, 2007. 

[2] K. Zhao, V. Rawat and J.F Lee, "A Domain Decomposition 

Method for Electromagnetic Radiation and Scattering Analysis of 

Multi-Target Problems," IEEE Transactions on Antennas and 

Propagation, vol. 56, pp. 2211 - 2221, August 2008. 

[3] I. Bardi, Zs. Badics and Z. Cendes, "Total and Scattered Field 

Formulations in the Transfinite Element Method," IEEE 

Transactions on Magnetics, vol. 44, pp. 778-781, June, 2008. 

[4] R. F. Harrington, Time–Harmonic Electromagnetic Fields, John 

Wiley & Sons, Inc. New York, 2000. 

[5] Y. Shao, Z. Peng and J.F Lee, "Full-Wave Real-Life 3-D Package Signal Integrity Analysis Using Nonconformal Domain Decomposition Method," IEEE Transactions on Microwave Theory and Techniques, vol. 59, pp. 230 - 241, February 2011.

[6] W. C. Chew and C.C. Lu, "The use of Huygens’ equivalence principle for solving the volume integral equation for scattering," IEEE Transactions on Antennas and Propagation, vol. 41, pp. 897 - 904, July 1993.

[7] Y.J Li and J.M. Jin, "A New Dual–Primal Domain Decomposition 

Approach for Finite Element Simulation of 3-D Large – Scale 

Electromagnetic Problems," IEEE Transactions on Antennas and 

Propagation, vol. 55, pp. 2803 – 2810, October 2007. 

[8] L.E. R. Petersson and J-M Jin, “Analysis of periodic structures via 

a time-domain finite-element formulation with a Floquet ABC,” 

IEEE Trans. AP, pp. 933-944, Mar. 2009. 

[9] J. Manges, J. Silvestro and R. Petersson, “Accurate and Efficient 

Extraction of Antenna Array Performance from Numerical Unit- 

Cell Data,” 2011 European Microwave Conference


Parallelization of the Transmission Line Matrix 

method. Modelling Schumann Resonances and 

Atmospherics 

S. Toledo-Redondo∗, A. Salinas∗, J. Fornieles∗, J. Portí†, B. Besser‡, and H.I.M. Lichtenegger‡ 

∗Department of Electromagnetism and Matter Physics, University of Granada, Spain. 

† Department of Applied Physics, University of Granada, Spain. 

‡ Space Research Institute, Austrian Academy of Sciences, Graz, Austria 

E-mail: sergiotr@ugr.es 

Abstract—In this paper, a parallelization of the Transmission-Line Modelling (TLM) method is presented. It is intended to 

work efficiently regardless of the spatial topology of the problem, by transforming the initial topology into a one-dimensional 

structure. It is designed for shared memory environments, and its implementation is carried out using OpenMP directives. 

The algorithm is applied to find the first cut-off frequency of the Earth-ionosphere waveguide by solving two models of 

the real system. The performance of the algorithm for the mentioned problem is studied in terms of speedup over two 

different platforms. Relative speedups of up to 16 are achieved with the use of 32 CPUs. Finally, the whole Earth-ionosphere 

cavity is simulated, with an accuracy of 5 km grid size, leading to an error of less than 1.5% in the Schumann Resonance 

frequencies. The spatial resolution achieved also enables for the first time the possibility of using this model to study the 

global effects generated by local phenomena in the Earth-ionosphere cavity. 

Index Terms—Earth-ionosphere waveguide, Schumann resonances, shared memory, speedup, TLM. 

I. INTRODUCTION


Numerical methods are a tool for tackling scientific 

and technological problems which are difficult or even 

impossible to solve analytically. In addition, simulation is 

often an intermediary step between design and construction 

of prototypes in industry. High performance computers are one of the keys to the present importance of these methods, because they allow increasingly complex situations to be simulated as the technology evolves. However, the speed of single processors seems to have reached its limit [1], and the tendency of CPU manufacturers is to ship multi-core processors instead of building faster single CPUs [2].

The Transmission-Line Modelling method [3] has been employed for the simulation of electromagnetic problems since 1971 [4], although it can simulate other problems as well, such as heat or particle diffusion, acoustic propagation, deformation in elastic solids, waves in fluids, etc. [5] [6]. It has been used previously, in 2D form, 

for the study of atmospheric phenomena, e.g., Schumann 

Resonances [7], which is a problem similar to the one 

that will be addressed in this paper. 

The propagation of atmospherics in the Earth-ionosphere waveguide is a complex problem which involves several natural media (ground, oceans, atmosphere, ionospheric plasma) as well as the phenomenon of lightning [8], [9]. A parallel-TLM algorithm is employed 

to model the propagation of these natural signals and 

allows finding the first cut-off frequency under two 

different approximated models of the natural waveguide 

formed by the ground and the ionosphere. 

Programming efficient algorithms with these relatively 

new hardware solutions is not straightforward. Different 

approaches must be taken into account according to 

the kind of hardware used. For instance, the way of designing a parallel code for a Graphics Processing Unit (GPU) [10] differs from that for a multi-core system with shared memory access [11]. Programming shared memory environments is probably the closest to traditional, i.e., non-parallel, computing, but there is still a great difference in the way the algorithms should be conceived [12].

In Section 2, the TLM method is briefly introduced, as 

well as the Symmetric Condensed Node (SCN). Section 

3 describes the approach employed to parallelize the 

method. In Section 4, the Earth-ionosphere waveguide is 

introduced, and it is solved by means of the proposed 

algorithm. The model is benchmarked and its performance 

in terms of speedup is presented. In Section 5, 

the model is employed to solve the whole lossless Earth-ionosphere 

cavity, obtaining its Schumann Resonances. 

A brief summary as well as the main conclusions of the 

paper are detailed in Section 6. 

II. THE TLM METHOD 

TLM is a numerical method intended for simulation of 

propagation problems which are governed by differential 

equations. Problems which have to deal with electromagnetics, 

heat diffusion, gravity waves, acoustic waves, 

etc. are suitable to be modelled with this technique. The idea behind the method is to build a circuit based on transmission lines which behaves analogously to the problem we want to solve, i.e., the governing equations are the same for the circuit and for the physical problem. In this work, TLM will be used to solve Maxwell's equations in the study of the Earth-ionosphere waveguide.

Fig. 1. Scheme of the Symmetric Condensed Node with 12 link lines.

The TLM method discretizes both time and space and 

sets up an iterative process in which the six components 

of the electromagnetic field evolve in time from a known 

initial situation. Therefore, the fields radiated and/or propagated in the space are simulated with an accuracy that is constrained by the size of the space discretization. A rule of thumb is that the minimum wavelength (λ) of interest must be at least ten times larger than the size of the cell (Δl), i.e., Δl ≤ λ/10 [13].

Depending on the problem we want to model, each independent cell will be simulated by a different arrangement of transmission lines and nodes. For 3D electromagnetic problems, the most widely used circuit since it was formulated is the Symmetrical Condensed Node (SCN) [14], together with its variations. In this paper, the SCN with stubs for 

conductivity [15] will be used. In Fig. 1, a scheme of the 

transmission lines arrangement is shown. With its 12 link 

lines, the node is capable of modelling the behavior of 

Maxwell’s equations for a differential of volume, ΔV . 

Regardless of the node employed for modelling each ΔV, the iteration process of TLM is always the same. For each node and time step n, t = nΔt (where Δt ≤ Δl/2c for a cubic SCN, c being the speed of light in the medium), there is a set of incident pulses or voltages Vi, one for each transmission line of the node. During the time step they travel along the line, and they are reflected and/or transmitted to other lines, depending on the node structure. The transmitted/reflected pulses, or simply the scattered pulses Vr, are related to Vi by the scattering matrix S:

Vr = S · Vi . (1) 

At the next time step, the scattered voltages from 

each node are converted into incident voltages of the 


nearby node, thus propagating the pulses along the entire 

network. It is important to fix time synchronism in a 

manner that all pulses in the mesh are simultaneously 

incident at the center of their respective node at each 

time nΔt. 
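
The two discretization constraints quoted in this section can be sketched in a few lines of C. This is only an illustration: the function names are hypothetical, the medium is assumed to be vacuum, and the time step is taken at the bound Δt = Δl/2c.

```c
#include <assert.h>

/* Speed of light in vacuum (m/s); the medium is assumed to be vacuum. */
#define C0 299792458.0

/* Largest cell size (m) allowed by the rule of thumb dl <= lambda/10. */
double max_cell_size(double lambda_min)
{
    assert(lambda_min > 0.0);
    return lambda_min / 10.0;
}

/* Highest frequency (Hz) resolved by cells of size dl, i.e. lambda = 10*dl. */
double max_frequency(double dl)
{
    assert(dl > 0.0);
    return C0 / (10.0 * dl);
}

/* Time step (s) of a cubic SCN, taken at the bound dt = dl/(2c). */
double scn_time_step(double dl)
{
    assert(dl > 0.0);
    return dl / (2.0 * C0);
}
```

With these assumptions, `scn_time_step(5000.0)` gives about 8.34 μs for a 5 km cell, and `max_frequency(10000.0)` gives about 3 kHz for a 10 km cell.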

III. PARALLELIZATION OF TLM 

The TLM method is time- and memory-consuming when applied to large problems. In its most 

basic form, at least 12 floating point numbers (floats) 

are needed in order to store the 12 voltages of each 

node. Six more floats become necessary if either nodes 

of variable size, permittivity (εr), or permeability (μr) of 

the medium are required. Finally, three more floats must 

be used for adding electric conductivity. The problems 

solved in this work make use of 15 different transmission 

lines per node, 12 required by the basic SCN configuration, 

plus 3 for modelling the electric conductivity. 

The operation which consumes most of the computational 

time is reflected in Eq. 1, since this matrix 

multiplication must be performed at each node for each 

time step. For the case involved in this paper, the matrix 

multiplication has been reduced to 18 floating point 

multiplications and 36 additions, at the cost of executing a different portion of code for the nodes which are at the border of the spatial distribution. The number of nodes for the problems involved in this work is on the order of 10^6, and all these computations can be done concurrently for different nodes.

Parallelizing the independent matrix multiplications 

of Eq. 1 requires doubling the minimum memory size 

required to hold the problem. Since the input voltage 

for one node is the output voltage for another node, 

Vi and Vr must be stored in different variables. On a 

sequential implementation of the model, i.e., not parallel, 

the order of execution is known, and the calculated Vr can 

overwrite Vi, if its value is previously stored on a local 

variable which can be erased when all related Vr have 

been calculated. Since the order in which the matrix multiplications will be performed is not known for a parallel version, doubling the minimum required memory size is an unavoidable penalty of parallelization. In the described TLM implementation, each node requires 30 floating-point variables (15 incident and 15 reflected voltages, 120 bytes in total) for storing the line voltages.

The algorithm has been designed to be independent 

of its spatial topology, and it should provide the same 

performance regardless of the arrangement of the nodes 

in the space. The motivation for this constraint is to have 

a very flexible tool for solving different problems. In 

order to work in the same way regardless of the spatial 

geometry, the algorithm is split into two different steps: 

the pre-processing and the TLM computation itself. 

The pre-processing is a fast (when compared to TLM 

computation) operation which is in charge of transforming 

an arbitrary topology to a common one which can 

accommodate any kind of spatial distribution. The idea

is to assign a unique identifier to each node of the initial 

topology and to store a vector with the unique identifiers 

of the adjacent nodes. In this way, the result can be seen 

as a one-dimensional vector of nodes where each one 

knows which others are its neighbors. Any complex 

distribution of nodes can be simplified to this unified 

arrangement, regardless of the arbitrary initial geometry. 

On the other hand, abstracting the initial topology to 

this new paradigm brings a penalty of 6 integers per 

node to store the neighbor identifier at each direction 

(for 3D topologies) plus another integer to mark the kind 

of medium that the node belongs to. Therefore, the total 

amount of RAM required for each node is 152 

bytes. 
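
The pre-processing idea can be illustrated with a minimal C sketch. It assumes, purely for illustration, that the initial topology is a rectangular nx × ny × nz box; the identifier scheme, the direction order and the `NO_NEIGHBOR` marker are hypothetical choices, not taken from the implementation described here.

```c
#include <assert.h>

/* Marker for "no neighbor in this direction" (hypothetical convention). */
#define NO_NEIGHBOR (-1)

/* Direction order used in this sketch: -x, +x, -y, +y, -z, +z. */
enum { XM, XP, YM, YP, ZM, ZP, NDIR };

/* Unique identifier of grid point (x, y, z) in an nx*ny*nz box. */
int node_id(int x, int y, int z, int nx, int ny)
{
    return x + nx * (y + ny * z);
}

/* Fill nbr[id][dir] with the identifier of each adjacent node. */
void build_neighbors(int nx, int ny, int nz, int (*nbr)[NDIR])
{
    assert(nx > 0 && ny > 0 && nz > 0 && nbr != 0);
    for (int z = 0; z < nz; ++z)
        for (int y = 0; y < ny; ++y)
            for (int x = 0; x < nx; ++x) {
                int *n = nbr[node_id(x, y, z, nx, ny)];
                n[XM] = (x > 0)      ? node_id(x - 1, y, z, nx, ny) : NO_NEIGHBOR;
                n[XP] = (x < nx - 1) ? node_id(x + 1, y, z, nx, ny) : NO_NEIGHBOR;
                n[YM] = (y > 0)      ? node_id(x, y - 1, z, nx, ny) : NO_NEIGHBOR;
                n[YP] = (y < ny - 1) ? node_id(x, y + 1, z, nx, ny) : NO_NEIGHBOR;
                n[ZM] = (z > 0)      ? node_id(x, y, z - 1, nx, ny) : NO_NEIGHBOR;
                n[ZP] = (z < nz - 1) ? node_id(x, y, z + 1, nx, ny) : NO_NEIGHBOR;
            }
}
```

Once the table is built, the computation loop only needs the one-dimensional node index and the six stored identifiers, so the original geometry no longer matters.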

The TLM computation loop employed is shown in 

high-level pseudo-code below, where the OpenMP directives 

have been included: 

#pragma omp parallel private(private variables)
{
  for(t=0..TotalTime)
  {
    #pragma omp single
    {
      // The reflected pulses become the incident ones at the new time step
      Vi = Vr;
      // System feeding
      for(i=0..NumberOfFeeds) V[i] = feeding;
      // Store the relevant output
      for(i=0..NumberOfOutputs) output = V[i];
    }
    #pragma omp for schedule(static)
    for(i=0..NumberOfNodes) Vr[i] = S*Vi[i];
  } // end for(t)
} // end pragma parallel

The main loop of the code is inside a #pragma omp parallel region. In this way, the overhead of creating (and destroying) threads is incurred only once for the whole execution. The main loop consists of an iteration over time steps, which cannot be parallelized and which requires synchronization of the threads working inside each iteration. Each time step iteration is divided into two blocks: a sequential block and a parallel block. The sequential block performs three different operations:

• Swap Vi and Vr. As mentioned before, Vi and Vr must be stored in separate memory addresses in order to enable parallelization. At the beginning of a time step, the reflected pulses from the previous iteration become the incident pulses on the neighbor nodes. In this implementation, only the pointers to the vectors Vi and Vr are exchanged, and the transfer of pulses between neighboring nodes is done implicitly in the matrix calculations. This avoids extra reading from and writing to memory, at the cost of a small penalty in processing and code complexity.

• System feeding. The initial electromagnetic problem 

may have sources on its initial definition. These 

sources bring external voltage pulses to the system, 

which are added in this part of the code. 

• Output storage. Some key nodes are marked as 

output and, therefore, the temporal evolution of 

their voltages is necessary to reconstruct the fields’ 


evolution afterwards. All the line voltages at each 

time step from these output nodes are stored in 

memory. 

The parallel block is in charge of the matrix multiplication of each node. It is composed of a parallel for. Since the Vr calculation can be performed independently for each node, the OpenMP directive is in charge of distributing the computations among the available threads. Therefore, each thread will compute a portion of the total range of i. Since the clause schedule (static) is present, the i range is divided into equally sized portions, one per available thread.
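
The structure of the loop (one parallel region, a sequential pointer swap, and a statically scheduled parallel for) can be tried out with the following runnable C sketch. It is not the SCN code: as a stand-in, it assumes a one-dimensional chain of matched nodes with two link lines each, where a pulse entering one port leaves through the other, and a reflection coefficient `gamma` at both ends. All names are hypothetical. As in the description above, the incident pulses are gathered from the neighbors inside the scattering step. The code also compiles without OpenMP, in which case the pragmas are ignored.

```c
#include <assert.h>

/* Hypothetical 1D stand-in for the SCN: two link lines per node. */
enum { LEFT, RIGHT, PORTS };

/*
 * Advance a chain of n_nodes nodes by n_steps time steps.
 * On entry, vr holds the pulses reflected at the previous step and vi is
 * scratch space (both of size PORTS * n_nodes). gamma is the reflection
 * coefficient at both ends of the chain. Returns the buffer holding the
 * most recent reflected pulses.
 */
float *tlm_run(float *vi, float *vr, int n_nodes, int n_steps, float gamma)
{
    assert(vi != 0 && vr != 0 && vi != vr && n_nodes > 0 && n_steps >= 0);

    #pragma omp parallel
    {
        for (int t = 0; t < n_steps; ++t) {
            /* Sequential block: only the two pointers are exchanged. */
            #pragma omp single
            {
                float *tmp = vi;
                vi = vr;
                vr = tmp;
            }
            /* Parallel block: gather incident pulses from the neighbors
               and scatter them; each node writes only its own entries. */
            #pragma omp for schedule(static)
            for (int i = 0; i < n_nodes; ++i) {
                float in_left  = (i > 0)
                    ? vi[PORTS * (i - 1) + RIGHT]
                    : gamma * vi[PORTS * i + LEFT];
                float in_right = (i < n_nodes - 1)
                    ? vi[PORTS * (i + 1) + LEFT]
                    : gamma * vi[PORTS * i + RIGHT];
                vr[PORTS * i + LEFT]  = in_right;  /* matched line: */
                vr[PORTS * i + RIGHT] = in_left;   /* full transmission */
            }
        }
    }
    return vr;
}
```

A unit pulse leaving node 0 to the right advances one node per time step and changes sign when it reaches a perfectly conducting end (gamma = -1). The implicit barriers at the end of the single and for constructs provide the synchronization between time steps.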

In order to reduce the total time of computation, 

several optimizations have been included here, which 

make the real code hard to read. One of them reduces 

the number of multiplications, by identifying operations 

which are performed several times for the calculation of 

different reflected pulses. Another, the most complex one, 

deals with the neighboring of the nodes. It is implemented 

in such a way that the nodes on the edges of the initial 

geometry are treated in a different manner than the 

internal nodes are. The code is different but the amount of computation remains similar, except for the nodes which lie on the edge on more than one of their sides, for which more computation is needed. The number of such nodes is usually small in most geometries. This is true for the models considered in this work, which is why scheduling the parallel for as static improves the performance of the algorithm, even though a few nodes require a bit more computation than others.

IV. MODELLING THE EARTH-IONOSPHERE 

WAVEGUIDE 

The algorithm described above has been employed to 

simulate the Earth-ionosphere waveguide. The surface of 

Earth behaves like a good conductor in the Very Low Frequency range (VLF, i.e., on the order of kHz), with conductivity ∼10^-2 S/m for ground and ∼3.2 S/m for sea water [16]. There is air above the ground, which is of dielectric nature. As the altitude increases, the number of free electrons increases too, the density of neutrals decreases, and the air starts behaving like a conductor.

A typical conductivity profile with altitude dependence 

is shown in Fig. 2 (top) [17]. 

The excitation sources of the waveguide are lightning 

strokes [18]. They generate a broadband signal which 

differs in orientation, strength and duration depending on 

its nature (cloud to ground, cloud to cloud, Q-bursts, etc.). 

A typical stroke in positive cloud to ground lightning 

is depicted in Fig. 2 (bottom) [19], which has been 

employed as excitation in the problem considered. 

The signal originated by the stroke travels a certain distance, guided between the two plates, before dying out due to losses. To a first approximation, the system can be regarded as an infinite parallel-plate waveguide. According to [20], the cut-off frequencies for a lossless waveguide of this geometry are located at:

$f_n = \dfrac{n c}{2h}$ (2)

where c is the speed of light in vacuum, h is the distance between the parallel plates, n is the mode number, and $f_n$ the associated cut-off frequency of the mode.

Fig. 2. Conductivity profile of the atmosphere with altitude (top), extracted from [17], and typical current for cloud to ground lightning (bottom).

Fig. 3. Detail of the first and second cut-off frequencies for the lossless and the lossy Earth-ionosphere waveguide. [Plot: vertical electric field as a function of frequency; amplitude (a.u.) versus frequency (Hz), 1000 to 8000 Hz; curves for the lossy and the lossless waveguide.]
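
Eq. 2 is simple enough to evaluate directly. The sketch below is illustrative only: the function name is hypothetical, and the plate separation of 100 km used in the usage note is an assumed round value, not a parameter stated for this waveguide model.

```c
#include <assert.h>

/* Speed of light in vacuum (m/s). */
#define C0 299792458.0

/* Cut-off frequency (Hz) of mode n for plates separated by h metres,
   following Eq. 2: f_n = n*c/(2*h). */
double cutoff_frequency(int n, double h)
{
    assert(n >= 1 && h > 0.0);
    return n * C0 / (2.0 * h);
}
```

For an assumed separation of 100 km, `cutoff_frequency(1, 100.0e3)` is close to 1.5 kHz and the second mode lies at twice that value. Reducing h raises every cut-off frequency.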

The problem has been modelled with the algorithm, 

both for a lossless and for a lossy waveguide. For the 

lossless waveguide, the conductivity is supposed to be 

zero in the dielectric. For the lossy waveguide, the night 

conductivity profile from Fig. 2 (top) is applied. The 

parallel plates are taken as perfect conductors in both 

cases. Details of the first and second cut-off frequencies are depicted in Fig. 3, which shows the electric field in the z-direction at a distance of 45 km in the y-direction from the source (see Fig. 4 for the definition of the directions). It is interesting to note the effect of the conductivity, which increases the value of the cut-off frequencies, being equivalent to having a narrower waveguide.

A. Algorithm Benchmarking 

The total execution time of the waveguide model on different computers has been measured, using a different number of CPUs, in order to determine the scalability of our algorithm. Two different computers have been used in the benchmarking process:

• SuperMicro8 (SM8). Server with 2 AMD Opteron quad-core processors at 2.0 GHz and 32 GB RAM, in Non-Uniform Memory Access (NUMA) configuration. The OS is OpenSUSE 11.4 and the compiler employed is opencc 4.2.4 (level 2 of optimization).

• SuperMicro32 (SM32). Server with 4 AMD Opteron eight-core processors at 2.0 GHz and 96 GB RAM, in Non-Uniform Memory Access (NUMA) configuration. The OS is OpenSUSE 11.4 and the compiler employed is opencc 4.2.4 (level 2 of optimization).

Fig. 4. Spatial arrangement of the waveguide model.

Fig. 5. Total time of execution (left) and relative speedups (right) for the different platforms. [Plots: "Scalability of TLM algorithm"; total time of execution (s) and speedup = time 1core / time nCores, both versus number of CPUs; series SM32, SM32 round-robin and SM8.]

The problem benchmarked makes use of symmetry 

and the initial grid is two-dimensional. According to 

Fig. 4, the symmetry is applied in the x-direction. The 

conductivity profile is extended along z, and the output 

measured at a certain distance along the y-direction. The 

node size is 1.5 km, the time step is 2.5 μs, the number of time steps is 7,500, and the total number of nodes is ∼10^6 (67 nodes in z, 15,000 nodes in y). The excitation is placed next to the ground, at the center of the waveguide. With this configuration, a total of 7.5·10^9 step-node computations must be performed to solve the problem. The total execution time and relative speedups are shown in Fig. 5, for both platforms. The relative speedup is defined as the execution time using 1 core divided by the execution time using n cores. A maximum speedup of 6 is achieved with SM8, when making use of its 8 CPUs. On SM32, a maximum speedup of 16 is obtained when using 30 CPUs.
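
The workload and the speedup metric can be written down as a short C sketch. The function names are hypothetical, and the timings in the usage note are made-up values chosen only to illustrate the ratio.

```c
#include <assert.h>

/* Number of step-node computations for a 2D grid run over n_steps steps. */
long long step_node_count(long long nodes_z, long long nodes_y,
                          long long n_steps)
{
    assert(nodes_z > 0 && nodes_y > 0 && n_steps > 0);
    return nodes_z * nodes_y * n_steps;
}

/* Relative speedup: time on 1 core divided by time on n cores. */
double relative_speedup(double t_one_core, double t_n_cores)
{
    assert(t_one_core > 0.0 && t_n_cores > 0.0);
    return t_one_core / t_n_cores;
}
```

`step_node_count(67, 15000, 7500)` yields 7,537,500,000, i.e. the ∼7.5·10^9 computations quoted above. With hypothetical timings of 1600 s on one core and 100 s on n cores, `relative_speedup` returns 16.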

As can be observed in Fig. 5, two benchmarks have been measured for each computer. The aim is to compare the performance with and without the Round-Robin memory allocation policy. This policy consists in requesting the operating system to balance the memory reservation equally among the different portions of RAM. This can be accomplished via the numactl tool [21]. If this policy is not enabled, the memory reservation will be performed sequentially, and only some memory blocks will be used. Since the computers employed have a NUMA architecture, each processor has faster access to a certain RAM bank, while access to the others is slower. Moreover, if the Round-Robin policy is not set, the different CPUs will have to compete to gain access to that particular RAM bank, slowing down the overall computation. This technique is especially effective for a large number of CPUs (see Fig. 5).

V. MODELLING THE WHOLE EARTH-IONOSPHERE 

CAVITY 

In this section, the whole cavity is considered in the simulation, leading to a model that is much more demanding in time and RAM. The cavity has been considered as the space between two concentric spheres of radii 6,370 and 6,470 km, with perfectly conducting walls at the borders and no conductivity in the interior. The spherical shell has been modeled by cubic nodes, in this case of size Δl=5 km. The total number of nodes is ∼4.14·10^8, and the amount of RAM required is ∼61.5 GBytes. Around 1.1 GBytes are employed for storing the outputs. For a spatial grid with a 5 km resolution, the time step required is 8.34 μs. The number of time iterations calculated was 2.4·10^5, and therefore the simulated time length is ∼2 s. A frequency resolution of 0.5 Hz is achieved when the FFT is computed with these parameters. The total execution time required when using 32 cores on SM32 (see Section IV-A) is roughly 6.0·10^5 s, i.e., around seven days.
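
The orders of magnitude in this paragraph can be cross-checked with a small C sketch. It assumes that the node count is approximated by the shell volume divided by the cell volume, which ignores how the cubic cells fit the curved walls; the function names are hypothetical.

```c
#include <assert.h>

#define PI 3.14159265358979323846

/* Approximate number of cubic cells of size dl filling the shell between
   radii r_in and r_out (all lengths in the same unit). */
double shell_node_count(double r_in, double r_out, double dl)
{
    assert(r_in >= 0.0 && r_out > r_in && dl > 0.0);
    double volume = 4.0 / 3.0 * PI
                  * (r_out * r_out * r_out - r_in * r_in * r_in);
    return volume / (dl * dl * dl);
}

/* Frequency resolution (Hz) of an FFT over n_steps samples spaced dt (s). */
double frequency_resolution(double n_steps, double dt)
{
    assert(n_steps > 0.0 && dt > 0.0);
    return 1.0 / (n_steps * dt);
}
```

Under this assumption, `shell_node_count(6370.0, 6470.0, 5.0)` is about 4.14·10^8, and `frequency_resolution(2.4e5, 8.34e-6)` is about 0.5 Hz.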

The excitation source of the cavity has been located at 

θ=0 and r=6,372 km, i.e., the North Pole. The excitation 

corresponds to a vertical positive Cloud to Ground (+CG) 

lightning, and its current is shown in Fig. 2 (bottom). This 

stroke starts at t=0, and lasts for 500 μs. 

With this spatial arrangement, the problem has symmetry over the φ coordinate, and therefore the outputs have all been located at φ=0. A total of 101 nodes are marked as output, and they are equally spaced along the coordinate θ, from 0 to π, for r=6,370 km, i.e., at the surface, because it is the common location for SR measurements.

As SR analytical models state [22] [23], the two 

relevant components of the electromagnetic field are Er 

and Hφ. In Figure 6, the six components of the output 

corresponding to θ=π/4 have been plotted, in order to 

show this fact. The other output nodes show similar 

results, where the two components mentioned are much 

greater than the rest. 

In order to corroborate the results from the simulations, 

the relationship between the modal amplitude of the first six SR and the angular distance to the source for the 

101 nodes marked as output (θ=0, θ=π/100,..., θ=π) has 

been plotted in Figure 7. This result is in agreement 

with analytical model results [22] [23], which show the 

amplitude dependence of SR modes with the distance to 

the source. 


Fig. 6. Spectra of Electric (left) and Magnetic (right) field components. 

The relevance of Er and Hφ can be observed. 

Fig. 7. Dependence of SR modal amplitude with θ, for the lossless cavity. [Plots: Hphi [T/sqrt(Hz)] versus θ [rad] from 0 to π; one panel for SR1 to SR3 and one for SR4 to SR6.]

The simulation has been repeated changing only the 

size of the spatial grid to Δl=10 km. Doubling the size 

of the nodes reduces by a factor of eight the number 

of nodes, at the cost of a poorer fitting of the spherical 

geometry and worse spatial resolution. The maximum 

valid frequency is also reduced by a factor of two, but 

this is not important for the study of SR, because the 

top frequency is still 3 kHz (the condition is λ ≥ 10Δl). 

The amount of memory required is reduced to 9.1 GBytes 

(with 1.1 GBytes for storing the results). The execution 

time, again with 32 cores in SM32, is reduced to 7.6·10^4 s, i.e., roughly 21 hours. The magnetic fields in the φ direction 

at an angular distance of π/4 of the two simulations 

are compared in Figure 8. 

The six maxima from each spectrum of Hφ have been 

extracted and averaged, with the aim of using them as a 

proxy of the resonance position. The results are shown 

in Table I, for both simulations. 

It can be observed that the results for the central frequencies 

of the six SR are similar in the two simulations 

and with the results from the analytical solution. For the 

case of the 10 km size simulation, the errors for the 

central frequencies are always under 3%. This error is reduced to less than 1.5% for the 5 km simulation.

Fig. 8. Comparison of Hφ at θ=π/4, for the two simulations (5 km and 10 km). [Plot: amplitude (a.u.) from 0 to 1; horizontal axis from 0 to 50; curves for 10 km and 5 km.]

TABLE I
SR CENTRAL VALUES IN HZ, LOSSLESS CAVITY.

             1st SR   2nd SR   3rd SR   4th SR   5th SR   6th SR
10 km        10.24    17.74    24.98    32.35    39.63    46.93
5 km         10.47    17.99    25.48    32.96    39.98    47.47
Analytical   10.51    18.20    25.75    33.24    40.71    48.17
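
The percentage errors are understood here as deviations relative to the analytical values. The text does not spell out the definition, so the following C sketch assumes |f_sim − f_ana| / f_ana; the function name is hypothetical.

```c
#include <assert.h>

/* Relative deviation, in percent, of a simulated value from a reference. */
double relative_error_percent(double simulated, double reference)
{
    assert(reference != 0.0);
    double diff = simulated - reference;
    if (diff < 0.0)
        diff = -diff;
    double ref = (reference < 0.0) ? -reference : reference;
    return 100.0 * diff / ref;
}
```

For the first resonance, the Table I values of 10.24 Hz (10 km grid) and 10.51 Hz (analytical) give a deviation of about 2.6%, and the 5 km value of 10.47 Hz gives about 0.4%.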


VI. CONCLUSIONS

The TLM method has been briefly described and the inherently parallel parts of the algorithm have been pointed out. It has been parallelized for shared memory architectures, by using 

OpenMP. The solution obtained has been employed to 

simulate the Earth-ionosphere waveguide, and to observe 

the changes produced by the conductivity profile over 

the cut-off frequency, by comparing the results with 

the lossless waveguide. The effect of this conductivity 

is to increase the value of the cut-off frequencies, 

being equivalent to having a narrower waveguide. The 

algorithm has been run on two different platforms and 

benchmarked. The algorithm scales up to a speedup of 

16 by using 30 CPUs. In order to obtain the maximum speedups, it is necessary to set a Round-Robin memory allocation policy, which minimizes the effects of the NUMA architecture. Finally, a huge simulation (the 

whole Earth-ionosphere cavity with a 5 km resolution) 

has been performed for validation of the parallel algorithm. 

The lossless version of the cavity is solved, and 

the electromagnetic fields obtained are consistent with 

the analytical solution of the cavity. Since a Cartesian 

grid of only 5 km size per cell was employed, the 

errors are lower than 1.5% for the Schumann resonance 

frequencies. The spatial resolution achieved enables the 

possibility of using this model to study the global effects 

generated by local phenomena in the Earth-ionosphere 

cavity. 

Acknowledgments: This work was supported by the 

Consejería de Innovación, Ciencia y Empresa of the Andalusian 

Government and Ministerio de Ciencia e Innovación 


of Spain under projects with references PO7-FQM-03280 

and FIS2010-15170, co-financed with FEDER funds of 

the European Union. 

REFERENCES 

[1] Flynn, L.J.: Intel halts development of 2 new microprocessors, The 

New York Times, May 8, 2004, retrieved on March 2, 2011 (2004) 

[2] Yu, W., Yang X., Liu, Y., Ma, L.-C., Su, T., Huang, N.-T., Mittra, 

R., Maaskant, R., Lu, Y., Che, Q., Lu, R., Su, Z.: A new direction 

in computational electromagnetics: solving large problems using 

the parallel FDTD on the BlueGene/L Supercomputer providing 

teraflop performance, IEEE Antennas Propag. Mag., 50(2), 26–44 (2008).

[3] Christopoulos, C.: The Transmission-Line Modelling Method, 

TLM, IEEE Press, Piscataway, N.J. (1995) 

[4] Johns, P.B, Beurle, R.L.: Numerical solution of 2-dimensional 

scattering problems using a transmission-line matrix, Proc. Inst. 

Elec. Eng., 118(9): 1203–1208, 1971. 

[5] De Cogan, D., Pulko, S.H., O’Connor, W.J.: Transmission-Line 

Matrix in computational mechanics, CRC Press, Boca Raton, Fla. 

(1995) 

[6] Enders, P., Pulko, S.H., Stubbs, D.M.: TLM for diffusion: consistent 

first time step. Two-dimensional case, International Journal of 

numerical modelling-electronic networks devices and fields, 15(3), 

251–259 (2002) 

[7] Morente, J.A., Molina-Cuberos, G.J., Portí, J., Besser, B.P., Salinas, 

A., Schwingenschuch, K., Lichtenegger, H.: A numerical simulation 

of Earth’s electromagnetic cavity with the Transmission Line 

Matrix method: Schumann resonances, J. Geophys. Res., 108(A5),

1195–1205 (2003) 

[8] S. Toledo-Redondo, Parrot, M., and Salinas, A., “Variation of the 

first cut-off frequency of the Earth-ionosphere waveguide observed 

by DEMETER”, J. Geophys. Res., vol. 117, pp. A04321, 2012. 

[9] Cummer, S.A.: Modeling electromagnetic propagation in the Earth-ionosphere

waveguide, IEEE Trans. Antennas Propag., 48(9), 1420–1432

(2000) 

[10] Kirk, D.B., Hwu, W.W.: Programming massively parallel processors, 

a hands-on approach, Morgan Kaufmann, Burlington, MA

(2010) 

[11] Chapman, B., Jost, G., van der Pas, R.: Using OpenMP: Portable 

Shared Memory Parallel Programming, The MIT Press, Cambridge, 

Massachusetts (2007)

[12] Breshears, C.: The art of concurrency, a thread monkey’s guide 

to writing parallel applications, O’Reilly, Sebastopol, CA (2009) 

[13] Morente, J.A., Jiménez, G., Portí, J., Khalladi, M.: Dispersion 

analysis for TLM mesh of symmetrical condensed nodes with 

stubs, IEEE Trans. Microwave Theory Tech. 43(2), 452–456 (1995) 

[14] Johns, P.B.: A symmetrical condensed node for the TLM method, 

IEEE Trans. Microwave Theory Tech. 35(4), 370–377 (1987) 

[15] Naylor, P., Desai, R.A.: New three dimensional symmetrical condensed 

lossy node for solution of electromagnetic wave problems 

by TLM, Electron. Lett., 26(7), 492-494 (1990) 

[16] Rycroft, M.J., Harrison, R.G., Nicoll, K.A., Mareev, E.A.: An 

overview of Earth’s global electric circuit and atmospheric conductivity, 

Space Sci. Rev., 137, 83–105 (2008) 

[17] Pechony, O., Price, C.: Schumann resonance parameters calculated 

with a partially uniform model on Earth, Venus, Mars, and 

Titan, Radio Sci., 39, RS5007 (2004) 

[18] Storey, L.R.O.: An investigation of whistling atmospherics, Philosophical 

transactions of the royal society of London series A - 

Mathematical and physical sciences, 246(908), 113–141 (1953) 

[19] Baba, Y., Rakov, V.A.: Present understanding of the lightning return 

stroke, in Lightning: Principles, instruments and applications, 

Springer (2009) 

[20] Cheng, D.K.: Field and wave electromagnetics, Addison-Wesley 

(1989) 

[21] Linux Manual pages, numactl(8), 

http://linuxmanpages.com/man8/numactl.8.php 

[22] Sentman, D.D., Schumann Resonances, in Handbook of atmospheric electrodynamics, 

CRC Press, Boca Raton, Fla, (1995) 

[23] Toledo-Redondo, S., Salinas, A., Portí, J., Morente, J.A., Fornieles, J., 

Méndez, A., Galindo-Zaldívar, J., Pedrera, A., Ruiz-Constán, A., and Anahnah, 

F., Study of Schumann resonances based on magnetotelluric records 

from the western Mediterranean and Antarctica, J. Geophys. Res., 115, D22114 (2010).


A Novel Parametric Model Order Reduction 

Approach with Applications to Geometrically 

Parameterized Microwave Devices 

Stefan Burgard*, Ortwin Farle*, and Romanus Dyczij-Edlinger*

*Chair for Electromagnetic Theory, Saarland University, D-66123 Saarbrücken, Germany

E-mail: edlinger@lte.uni-saarland.de 

Abstract—Methods of model-order reduction approximate the transfer behavior of a given high-dimensional system by that 

of a low-order one, which is much faster to evaluate. In the parametric case, the system features additional parameters, 

such as material properties or geometric design variables. The parametric order-reduction methods available today still 

exhibit a number of limitations, particularly with respect to convergence rates and the size of the reduced-order model. 

This contribution presents a novel technique based on affine parameter reconstruction and parameter-dependent projection 

matrices. It features high rates of convergence, supports local adaptation, and yields reduced-order models that are of very 

low dimension and thus fast to evaluate. 

Index Terms—Computer-aided engineering, geometric parameters, parametric model order reduction, parametric models. 


I. INTRODUCTION

This paper addresses microwave components with linear

time-invariant system properties. Since most practical 

structures possess complicated shape and inhomogeneous 

material properties, their fields-level analysis requires numerical 

methods, such as the finite-element (FE) method. 

FE discretization in the frequency domain results in 

systems of linear equations which are characterized by 

sparse matrices of high dimension. While solving a FE 

system at one single operating frequency may not be particularly 

demanding on modern computers, the analysis 

of broad frequency bands still tends to be very time-consuming.

The situation gets even worse when multiple 

parameters, such as material properties or geometric 

design variables, are present, and entire response surfaces 

are to be computed. 

Methods of model-order reduction (MOR) address this 

issue by approximating the behavior of the original system 

by a reduced-order model (ROM) that is very cheap 

to solve. As long as the frequency is the sole parameter, 

powerful single-point [1], [2] or multi-point [3], [4] 

algorithms are readily available. The incorporation of 

additional parameters, especially those of geometric type, 

still poses challenges, with respect to convergence rates, 

computing times, and model dimension. One particular 

difficulty with geometric parameters is that they enter the 

FE matrices in the form of multivariate rational functions 

of complicated structure. The authors use the technique 

of [5] and [6] for affine geometry approximation. 

Present parametric model-order reduction (PMOR) 

techniques fall into two categories: The first class comprises

methods [7], [5] that employ one global projection 

space for all parameters, including the frequency. It is 

characteristic of such entire-domain methods that the 

ROM dimension is large and rises quickly with increasing 

size of the parameter domain. The other class includes 

methods that instantiate frequency-domain ROMs for 

a set of sampling points in the domain of geometric 

parameters and employ interpolation over sub-domains 

to account for geometry variations. Thanks to their local 

nature, the resulting sub-domain ROMs are of low dimension. 

While existing techniques [8] - [11] interpolate 

the frequency-domain ROMs directly, the method proposed 

in this paper interpolates projection matrices. One 

particular advantage of this approach is to decouple the 

approximation of the effects of geometric parameters on 

the FE matrices, which may be the dominant source of 

error, from the actual PMOR process. 

The remainder of the paper is organized as follows: 

Section II presents the underlying parameter-dependent 

FE system. The treatment of geometric parameters is 

reviewed in Section III. The new PMOR approach is 

developed in Section IV. This constitutes the main contribution 

of the paper. Section V gives numerical examples 

that demonstrate the benefits of the suggested approach. 

A brief summary in Section VI closes the paper.

II. ORIGINAL SYSTEM 

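We consider a time-harmonic electromagnetic FE system of dimension $N$ which possesses $Q$ inputs and outputs, respectively, and depends on the frequency $f \in \mathbb{R}$ and a vector $\mathbf{p} \in \mathcal{P} \subset \mathbb{R}^P$ of $P$ geometric parameters. The input vector is denoted by $\mathbf{u}$, the generalized state by $\mathbf{x}(f,\mathbf{p})$, and the output by $\mathbf{y}(f,\mathbf{p})$. The system is assumed to be of the form

$$\Bigl(\sum_{i=1}^{I} \varphi_i(f)\,\mathbf{A}_i(\mathbf{p})\Bigr)\,\mathbf{x}(f,\mathbf{p}) = \sum_{j=1}^{J} \theta_j(f)\,\mathbf{B}_j\,\mathbf{u}, \tag{1a}$$

$$\mathbf{y}(f,\mathbf{p}) = \sum_{j=1}^{J} \eta_j(f)\,\mathbf{B}_j^T\,\mathbf{x}(f,\mathbf{p}), \tag{1b}$$

wherein $\mathbf{B}_j \in \mathbb{R}^{N\times Q}$, and the functions $\varphi_i, \theta_j, \eta_j : \mathbb{R} \to \mathbb{C}$ and $\mathbf{A}_i : \mathbb{R}^P \to \mathbb{R}^{N\times N}$ are continuous. Eq. (1) implies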

that the topology of the FE mesh, i.e. the number and 

connectivity of the FE nodes, must remain the same over 

the whole parameter domain P. Meshes of this kind 

can be constructed for a wide class of parameterized 

geometries by, e.g., the morphing method of [12]. 

III. GEOMETRY INTERPOLATION 

In many cases, the parameter-dependent matrices 

Ai(p) are just multivariate rational functions. Nevertheless, 

their explicit representation [13] is quite complex 

in practice, because it requires tracking the effects of 

all geometric parameters from the solid model through 

the mesh generation process to the FE matrix generation 

stage. Therefore, the present paper follows the suggestion 

of [5] and [6] to approximate Ai(p) by a function of 

simpler structure. We set 

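$$\mathbf{A}_i(\mathbf{p}) \approx \sum_{\beta} \Gamma_\beta(\mathbf{p})\,\mathbf{A}_i^\beta \quad \text{for } \mathbf{p} \in \mathcal{P}, \tag{2}$$

wherein $\beta = [\beta_1,\ldots,\beta_P]$ is a multi-index, $\mathbf{A}_i^\beta \in \mathbb{C}^{N\times N}$, and $\Gamma_\beta : \mathbb{R}^P \to \mathbb{R}$ is a suitable interpolation function.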

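Thus, the approximate system reads:

$$\Bigl(\sum_{i} \varphi_i(f) \sum_{\beta} \Gamma_\beta(\mathbf{p})\,\mathbf{A}_i^\beta\Bigr)\,\mathbf{x}'(f,\mathbf{p}) = \sum_{j} \theta_j(f)\,\mathbf{B}_j\,\mathbf{u}, \tag{3a}$$

$$\mathbf{y}'(f,\mathbf{p}) = \sum_{j} \eta_j(f)\,\mathbf{B}_j^T\,\mathbf{x}'(f,\mathbf{p}). \tag{3b}$$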

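The interpolation functions $\Gamma_\beta$ are obtained as follows: For each parameter $p$, choose a set $N_p$ of interpolation points $\psi_p^k$,

$$N_p = \{\psi_p^k \in \mathbb{R} \mid k = 1,\ldots,K_p\}, \tag{4}$$

and associated interpolation functions $\gamma_p^k : \mathbb{R} \to \mathbb{R}$ with

$$\gamma_p^k(\psi_p^l) = \delta_{kl} \quad \text{for } \psi_p^l \in N_p. \tag{5}$$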

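Next construct a tensor-grid $G = N_1 \times \ldots \times N_P$ for the domain $\mathcal{P}$. Then the interpolation point $\mathbf{p}_\beta$ and interpolation function $\Gamma_\beta$ are given by

$$\mathbf{p}_\beta = [\psi_1^{\beta_1},\ldots,\psi_P^{\beta_P}], \tag{6a}$$

$$\Gamma_\beta(\mathbf{p}) = \gamma_1^{\beta_1}(\psi_1)\cdot\ldots\cdot\gamma_P^{\beta_P}(\psi_P). \tag{6b}$$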

By (6) and (2), the interpolation matrices $\mathbf{A}_i^\beta$ are given by the system matrices $\mathbf{A}_i$ of the original FE system (1) at the interpolation point $\mathbf{p}_\beta$:

$$\mathbf{A}_i^\beta = \mathbf{A}_i(\mathbf{p}_\beta). \tag{7}$$
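The affine approximation (2) with the sampling rule (7) can be illustrated for a single geometric parameter (P = 1). The following is a minimal sketch, not the authors' implementation: it assumes Lagrange polynomials as one possible choice of functions satisfying (5), and the 2×2 matrix `A_exact` is a hypothetical stand-in for a parameter-dependent FE matrix.

```python
import numpy as np

def lagrange_weights(nodes, psi):
    """Values gamma^k(psi) of the Lagrange polynomials on `nodes`.

    They satisfy the Kronecker-delta property (5): gamma^k(nodes[l]) = delta_kl.
    """
    nodes = np.asarray(nodes, dtype=float)
    w = np.ones(len(nodes))
    for k, xk in enumerate(nodes):
        for l, xl in enumerate(nodes):
            if l != k:
                w[k] *= (psi - xl) / (xk - xl)
    return w

def interpolate_matrix(nodes, samples, psi):
    """Affine approximation (2) for P = 1: sum_k gamma^k(psi) * A(psi^k)."""
    w = lagrange_weights(nodes, psi)
    return sum(wk * Ak for wk, Ak in zip(w, samples))

def A_exact(psi):
    """Hypothetical matrix with rational parameter dependence (illustration only)."""
    return np.array([[1.0 / (2.0 + psi), psi],
                     [psi, 2.0 + psi**2]])

nodes = np.linspace(-1.0, 1.0, 5)          # interpolation points psi^k
samples = [A_exact(x) for x in nodes]      # A^beta = A(p_beta), Eq. (7)
A_approx = interpolate_matrix(nodes, samples, 0.3)
err = np.max(np.abs(A_approx - A_exact(0.3)))
```

Only the sampled matrices and scalar weights enter the approximation, which is what makes the parameter dependence affine and therefore amenable to projection.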

IV. PARAMETRIC ORDER REDUCTION 

We construct the parametric ROM by replacing the test and trial space, respectively, of the interpolated FE system (3) by an $n$-dimensional subspace $S(\mathbf{p})$ which depends continuously on the parameter vector $\mathbf{p}$. For this purpose, a Galerkin procedure based on a parameter-dependent projection matrix $\mathbf{V}(\mathbf{p}) : \mathbb{R}^P \to \mathbb{R}^{N\times n}$, with

$$S(\mathbf{p}) = \operatorname{range}\{\mathbf{V}(\mathbf{p})\}, \tag{8}$$

is applied to (3). The resulting ROM is of the form

$$\Bigl(\sum_{i} \varphi_i(f) \sum_{\beta} \Gamma_\beta(\mathbf{p})\,\tilde{\mathbf{A}}_i^\beta(\mathbf{p})\Bigr)\,\tilde{\mathbf{x}}(f,\mathbf{p}) = \sum_{j} \theta_j(f)\,\tilde{\mathbf{B}}_j(\mathbf{p})\,\mathbf{u}, \tag{9a}$$

$$\tilde{\mathbf{y}}(f,\mathbf{p}) = \sum_{j} \eta_j(f)\,\tilde{\mathbf{B}}_j^T(\mathbf{p})\,\tilde{\mathbf{x}}(f,\mathbf{p}), \tag{9b}$$

with

$$\tilde{\mathbf{A}}_i^\beta(\mathbf{p}) = \mathbf{V}^T(\mathbf{p})\,\mathbf{A}_i^\beta\,\mathbf{V}(\mathbf{p}), \tag{10a}$$

$$\tilde{\mathbf{B}}_j(\mathbf{p}) = \mathbf{V}^T(\mathbf{p})\,\mathbf{B}_j. \tag{10b}$$

As long as $n \ll N$, the frequency response of the ROM can be computed much more efficiently than the original one.

Fig. 1. Hypercube topology.
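The Galerkin projection (10) can be sketched for a fixed parameter value, where the projection matrix is constant. This is a minimal illustration under assumptions: the diagonal stiffness matrix, the identity mass matrix, the random input matrix, and the choice of a snapshot basis at three sample frequencies are all hypothetical and are not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
N, Q = 200, 2

# Hypothetical full-order model (K - f^2 M) x = B u, y = B^T x.
K = np.diag(np.linspace(1.0, 50.0, N))
M = np.eye(N)
B = rng.standard_normal((N, Q))

def full_output(f):
    """Transfer matrix of the full model: one N x N solve per frequency."""
    x = np.linalg.solve(K - f**2 * M, B)
    return B.T @ x

# Projection basis: orthonormalized solutions at a few sample frequencies.
snapshots = np.hstack([np.linalg.solve(K - f**2 * M, B) for f in (0.2, 0.5, 0.8)])
V, _ = np.linalg.qr(snapshots)              # N x n with n = 6

# Reduced matrices in the sense of (10), computed once.
K_r, M_r, B_r = V.T @ K @ V, V.T @ M @ V, V.T @ B

def rom_output(f):
    """Transfer matrix of the ROM: only an n x n solve per frequency."""
    x_r = np.linalg.solve(K_r - f**2 * M_r, B_r)
    return B_r.T @ x_r
```

Because the full solutions at the sample frequencies lie in the range of `V`, the ROM reproduces the full model there up to rounding; between sample points it is an approximation whose quality depends on the basis.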

A. Parameter dependent projection matrix 

We start by computing $n$-dimensional one-parameter ROMs with respect to frequency at all interpolation points $\mathbf{p}_\beta \in G$. The resulting projection matrices are denoted by $\hat{\mathbf{V}}_\beta \in \mathbb{C}^{N\times n}$.

The interpolation points $\mathbf{p}_\beta \in G$ subdivide the parameter domain into hypercubes $H^\beta \subset \mathbb{R}^P$. Based on the line segments $L_p^k = [\psi_p^k, \psi_p^{k+1}]$, we have

$$H^\beta = L_1^{\beta_1} \times \ldots \times L_P^{\beta_P}. \tag{11}$$

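Fig. 1 illustrates the setting in the two-dimensional case. Starting from one-dimensional hat functions $\xi_p^k : \mathbb{R} \to \mathbb{R}$,

$$\xi_p^k(\psi) = \begin{cases} \dfrac{\psi_p^{k-1}-\psi}{\psi_p^{k-1}-\psi_p^{k}} & \text{for } \psi \in L_p^{k-1}, \\[2ex] \dfrac{\psi_p^{k+1}-\psi}{\psi_p^{k+1}-\psi_p^{k}} & \text{for } \psi \in L_p^{k}, \\[2ex] 0 & \text{else,} \end{cases} \tag{12}$$

we construct piecewise multi-linear interpolation functions $\Xi_\beta : \mathbb{R}^P \to \mathbb{R}$ of compact support:

$$\Xi_\beta(\mathbf{p}) = \xi_1^{\beta_1}(\psi_1)\cdot\ldots\cdot\xi_P^{\beta_P}(\psi_P). \tag{13}$$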
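The hat functions (12) and their tensor products (13) can be sketched as follows. The function names, the 0-based indexing, and the example grid are illustrative assumptions; the snippet only demonstrates that the weights of all grid points sum to one inside the grid, as expected for multi-linear interpolation.

```python
import itertools
import numpy as np

def hat(nodes, k, psi):
    """Hat function xi^k of (12) on a sorted 1-D grid `nodes` (0-based index k)."""
    x = nodes[k]
    if k > 0 and nodes[k - 1] <= psi <= x:
        return (nodes[k - 1] - psi) / (nodes[k - 1] - x)
    if k < len(nodes) - 1 and x <= psi <= nodes[k + 1]:
        return (nodes[k + 1] - psi) / (nodes[k + 1] - x)
    return 0.0

def multilinear_weight(grids, beta, p):
    """Tensor-product weight Xi_beta(p) of (13)."""
    return float(np.prod([hat(g, b, psi) for g, b, psi in zip(grids, beta, p)]))

# Hypothetical two-parameter tensor grid (P = 2).
grids = [np.array([0.0, 1.0, 2.0]), np.array([0.0, 0.5])]
p = (0.25, 0.2)

weights = {beta: multilinear_weight(grids, beta, p)
           for beta in itertools.product(range(3), range(2))}
total = sum(weights.values())
```

Only the corners of the hypercube containing `p` receive nonzero weight, which reflects the compact support mentioned above.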

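Within a given hypercube $H^\alpha$, the parameter-dependent projection matrix $\mathbf{V}(\mathbf{p})$ is defined by

$$\mathbf{V}(\mathbf{p}) = \sum_{\mathbf{p}_\beta \in H^\alpha} \Xi_\beta(\mathbf{p})\,\hat{\mathbf{V}}_\beta\,\mathbf{T}_\beta^\alpha \quad \text{for } \mathbf{p} \in H^\alpha. \tag{14}$$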

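Herein, the matrices $\mathbf{T}_\beta^\alpha \in \mathbb{R}^{n\times n}$ are provided in order to conduct state transformations. They are constructed as follows: Following [8] and [9], a singular value decomposition [14] is performed,

$$\bigl[\hat{\mathbf{V}}_{\beta_1},\ldots,\hat{\mathbf{V}}_{\beta_{(2^P)}}\bigr] = \mathbf{U}\,\operatorname{diag}(\boldsymbol{\sigma})\,\mathbf{W}^H, \tag{15}$$

to determine a basis $\mathbf{R}_\alpha \in \mathbb{R}^{N\times n}$ for the $n$-dimensional subspace of highest energy over the hypercube $H^\alpha$, i.e., the subspace corresponding to the $n$ largest singular values:

$$\mathbf{R}_\alpha = \mathbf{U}(:, 1\!:\!n). \tag{16}$$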

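For any relevant state $\mathbf{R}_\alpha\tilde{\mathbf{x}}$, we require the ROM state at the interpolation point $\mathbf{p}_\beta$, $\hat{\mathbf{V}}_\beta\mathbf{T}_\beta^\alpha\tilde{\mathbf{x}}$, to be as close as possible:

$$\bigl\|\mathbf{R}_\alpha\tilde{\mathbf{x}} - \hat{\mathbf{V}}_\beta\mathbf{T}_\beta^\alpha\tilde{\mathbf{x}}\bigr\|_2 \overset{!}{=} \min \quad \forall\,\tilde{\mathbf{x}} \in \mathbb{C}^n, \tag{17a}$$

$$\Rightarrow\ \tilde{\mathbf{x}} - \mathbf{R}_\alpha^H\hat{\mathbf{V}}_\beta\mathbf{T}_\beta^\alpha\tilde{\mathbf{x}} = \mathbf{0} \quad \forall\,\tilde{\mathbf{x}} \in \mathbb{C}^n. \tag{17b}$$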

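Thus,

$$\mathbf{T}_\beta^\alpha = \bigl(\mathbf{R}_\alpha^T\hat{\mathbf{V}}_\beta\bigr)^{-1}. \tag{18}$$

Eq. (18) underlines that interpolating the bases $\hat{\mathbf{V}}_\beta$ directly, which is equivalent to taking $\mathbf{T}_\beta^\alpha = \mathbf{I}$, may cause gross error.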
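The construction (15), (16), (18) can be sketched with two local bases that span nearly the same subspace but are expressed in unrelated coordinates. All matrices below are hypothetical random data in real arithmetic; the sketch only illustrates why the state transformations matter before interpolating.

```python
import numpy as np

rng = np.random.default_rng(1)
N, n = 50, 3

# Two hypothetical local bases: nearly the same subspace, different coordinates.
W0 = np.linalg.qr(rng.standard_normal((N, n)))[0]
V1 = W0 @ rng.standard_normal((n, n))
V2 = (W0 + 1e-3 * rng.standard_normal((N, n))) @ rng.standard_normal((n, n))
bases = [V1, V2]

# (15)-(16): dominant n-dimensional subspace of the concatenated bases.
U, sigma, Wh = np.linalg.svd(np.hstack(bases), full_matrices=False)
R = U[:, :n]

# (18): state transformation T_beta = (R^T V_beta)^{-1} for each basis.
T = [np.linalg.inv(R.T @ Vb) for Vb in bases]
aligned = [Vb @ Tb for Vb, Tb in zip(bases, T)]

raw_gap = np.linalg.norm(V1 - V2)                       # bases before alignment
aligned_gap = np.linalg.norm(aligned[0] - aligned[1])   # bases after alignment
```

After the transformation, every aligned basis satisfies `R.T @ (V @ T) = I`, so the bases share a common coordinate system and a weighted sum of them, as in (14), remains meaningful.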

B. Assembly 

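Plugging (14) into (10) leads to the following representation of the reduced matrices within the hypercube $H^\alpha$:

$$\tilde{\mathbf{A}}_i^\beta(\mathbf{p}) = \sum_{\mathbf{p}_\gamma \in H^\alpha}\ \sum_{\mathbf{p}_\delta \in H^\alpha} \Xi_\gamma(\mathbf{p})\,\Xi_\delta(\mathbf{p})\,\mathbf{A}_{i,\gamma,\delta}^\beta, \tag{19a}$$

$$\tilde{\mathbf{B}}_j(\mathbf{p}) = \sum_{\mathbf{p}_\gamma \in H^\alpha} \Xi_\gamma(\mathbf{p})\,\mathbf{B}_{j,\gamma}, \tag{19b}$$

wherein

$$\mathbf{A}_{i,\gamma,\delta}^\beta = (\mathbf{T}_\gamma^\alpha)^T\,\hat{\mathbf{V}}_\gamma^T\,\mathbf{A}_i^\beta\,\hat{\mathbf{V}}_\delta\,\mathbf{T}_\delta^\alpha, \tag{20a}$$

$$\mathbf{B}_{j,\gamma} = (\mathbf{T}_\gamma^\alpha)^T\,\hat{\mathbf{V}}_\gamma^T\,\mathbf{B}_j. \tag{20b}$$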

Note that all the coefficient matrices in (20) are of 

reduced size and can be computed in advance. No O(N) 

operations are required during the solution process. 
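The offline/online split behind (19) and (20) can be sketched on a single one-dimensional cell. The matrix `A` and the two corner bases are hypothetical random data standing in for one matrix A_i^β and for the already transformed bases at the cell corners; the local coordinate `t` on [0, 1] is an illustrative convention.

```python
import numpy as np

rng = np.random.default_rng(2)
N, n = 40, 3

A = rng.standard_normal((N, N))                                  # stand-in for one A_i^beta
corner_bases = [rng.standard_normal((N, n)) for _ in range(2)]   # stand-ins for V_hat * T

# Offline: reduced n x n blocks as in (20a), one per pair of corners.
blocks = [[Vg.T @ A @ Vd for Vd in corner_bases] for Vg in corner_bases]

def reduced_matrix(t):
    """Online evaluation as in (19a); touches only n x n data."""
    xi = (1.0 - t, t)                        # hat-function values at the two corners
    return sum(xi[g] * xi[d] * blocks[g][d] for g in range(2) for d in range(2))

# Reference: interpolate the basis as in (14) first, then project as in (10a).
t = 0.3
V_t = (1.0 - t) * corner_bases[0] + t * corner_bases[1]
direct = V_t.T @ A @ V_t
```

Both routes give the same reduced matrix, but `reduced_matrix` never touches an object of dimension N, which is the point of precomputing the blocks.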

V. NUMERICAL EXAMPLES 

In the examples below, the single-parameter ROMs 

with respect to frequency at the interpolation points are 

computed by means of the single-point algorithm of [1]. 

A. Dielectric Post 

Fig. 2 shows the H plane filter of [16]. It consists of 

a dielectric post at the center of an air-filled rectangular 

waveguide. The model features two parameters: the operating 

frequency f ∈ [15, 25] GHz and the geometric 

parameter p ∈ [−1.5, 1.5] mm which defines the radius r 

of the post according to 

$$r = 2.5\ \text{mm} + p. \tag{21}$$

Fig. 3 shows instantiations of the parametric mesh [12] 

for p ∈{−1.5, 0, 1.5} mm. 


Fig. 2. Structure of rectangular waveguide filter [16]. All dimensions are in mm. Material properties of rod: relative electric permittivity $\varepsilon_d = 4$, relative magnetic permeability $\mu_d = 1$. Waveguide ports are denoted by $\Gamma_{WG}^{(1)}$ and $\Gamma_{WG}^{(2)}$, respectively.

Fig. 3. Instantiations of the parametric FE mesh for different values 

of the geometry parameter: p ∈ {−1.5, 0, 1.5} mm. Note that the 

meshes share the same topology. 

1) Response Surface and Errors: Fig. 4 presents the response surface of the magnitude of the reflection coefficient $|S_{11}|$. The parametric ROM is based on $M = 5$ interpolation points placed at the locations of the zeros of the fifth-order Chebyshev polynomial of the first kind. The expansion frequency for the single-parameter ROMs is set at the center of the frequency band, $f_{\text{exp}} = 20$ GHz. We define the error in $S_{11}$ by

$$E_{S_{11}}(f,p) = \tilde{S}_{11}(f,p) - S_{11}(f,p), \tag{22}$$

wherein $\tilde{S}_{11}$ denotes the PMOR result, and $S_{11}$ is the reference solution, which is computed by conventional FE analysis, using the same mesh. The complete error surface is given in Fig. 5. It can be seen that errors are on the order of $10^{-3}$, which is below the typical level of the FE discretization error. Note that calculating the error surface is only possible for very simple structures, like the present filter, because each of the 101×101 sampling points in $f$-$p$ space requires a separate FE run.
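The placement of interpolation points at Chebyshev zeros can be sketched as follows. The text does not state how the zeros are mapped to the parameter interval; the affine map from [-1, 1] to [-1.5, 1.5] mm used here is an assumption, and `chebyshev_zeros` is a hypothetical helper name.

```python
import numpy as np
from numpy.polynomial import chebyshev as cheb

def chebyshev_zeros(m, a, b):
    """Zeros of the m-th order Chebyshev polynomial of the first kind,
    mapped affinely from [-1, 1] to [a, b] and sorted in ascending order."""
    k = np.arange(1, m + 1)
    x = np.cos((2 * k - 1) * np.pi / (2 * m))
    return np.sort(0.5 * (a + b) + 0.5 * (b - a) * x)

nodes = chebyshev_zeros(5, -1.5, 1.5)       # radius variations p in mm

# Check on the reference interval: T_5 vanishes at the unmapped nodes.
T5_at_nodes = cheb.chebval(nodes / 1.5, [0, 0, 0, 0, 0, 1])
```

Under this mapping the five points lie at approximately 0, ±0.882 and ±1.427 mm, so they cluster towards the ends of the interval and do not include the end points themselves.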

Fig. 4. Dielectric post: Response surface of the magnitude of the 

reflection coefficient S11 as a function of operating frequency f and 

radius variation p. 

Fig. 5. Dielectric post: Error surface of |S11|. 

Computational data for conventional FE analysis and 

two different PMOR models, ROM3 and ROM5, using 

M = 3 and M = 5 interpolation points, respectively,

are given in Table I. It can be seen that, even though the 

dimension of the original FE system is very small, the 

larger of the two models, ROM5, is still 150 times faster 

to evaluate. 

TABLE I
COMPUTATIONAL DATA FOR DIELECTRIC POST.

Model                      ROM5           ROM3           FE
Number of grid points      5              3              -
Moment-matching order      10             7              -
Dimension                  22             16             5616
Model generation (s) *     365.8          177.7          -
Evaluations per s *        481.0          757.3          3.2
|Average error in S11|     4.99 · 10^-4   5.93 · 10^-3   0

* MATLAB code on Intel Pentium 4 (3 GHz), one thread used.

2) Analysis of Suggested Procedure: Our first goal is to compare the proposed method, which employs state transformations, to direct interpolation of the ROM bases $\hat{\mathbf{V}}_\beta$. For this purpose, we consider a ROM based on $M = 5$ equidistant sampling points. Fig. 6 presents the error in $S_{11}$ (22) as a function of radius variation $p$ at $f = 25$ GHz: The necessity of proper state transformations is evident.

Fig. 6. Dielectric post: Magnitude of error in reflection coefficient $S_{11}$ as a function of radius variation $p$ at $f = 25$ GHz. Note the positive effects of state transformations in the present method.

[Plot: $|E_{S_{11}}|$ versus radius variation (mm); curves: direct interpolation of ROM bases, present method.]

The next test addresses the rate of convergence of the 

proposed method. We start from 3 equidistant interpolation 

points at refinement level r =1, and refine the grid 

recursively. Thus, the total number of points at refinement 

level r is 

$$|G| = 2^r + 1. \tag{23}$$

Our measure of error is $\bar{E}_{S_{11}}$, the average error in $S_{11}$ at $f = 25$ GHz,

$$\bar{E}_{S_{11}} = \frac{1}{N_s}\sum_{n=1}^{N_s} \bigl|\tilde{S}_{11}(p_n) - S_{11}(p_n)\bigr|, \tag{24}$$

based on $N_s = 257$ equally spaced sampling points, $p_n \in [-1.5, 1.5]$ mm. Fig. 7 presents the magnitude of

the average error as a function of refinement level for 

direct ROM interpolation, a variant of the present method 

which uses piecewise linear geometry interpolation, and 

the suggested approach, employing global polynomial 

interpolation for the geometry. The results of Fig. 7 

show that the total error is dominated by the effects 

of geometry interpolation: The suggested method clearly 

outperforms competing approaches. 
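The bookkeeping of the convergence test, i.e. the grid sizes (23) and the average error (24), can be sketched as follows. The helper names are hypothetical, and the S-parameter inputs would come from the ROM and from reference FE runs, which are not reproduced here.

```python
import numpy as np

def grid_size(r):
    """Number of interpolation points at refinement level r, Eq. (23)."""
    return 2**r + 1

def refined_grid(r, a=-1.5, b=1.5):
    """Equidistant grid at level r; each level contains the previous one."""
    return np.linspace(a, b, grid_size(r))

def average_error(s_rom, s_ref):
    """Average magnitude of the complex error over all sampling points, Eq. (24)."""
    s_rom, s_ref = np.asarray(s_rom), np.asarray(s_ref)
    return np.mean(np.abs(s_rom - s_ref))

sizes = [grid_size(r) for r in range(1, 5)]   # levels r = 1, ..., 4
sampling_points = np.linspace(-1.5, 1.5, 257) # N_s = 257 points p_n in mm
```

Since the grids are nested, ROMs computed at the points of one refinement level can be reused at the next level; only the newly added points require additional single-parameter ROMs.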

Fig. 7. Dielectric post: Magnitude of average error versus grid refinement level $r$ at $f = 25$ GHz. The proposed method benefits from interpolating the FE matrices by polynomials of higher order.

[Plot: |average error in $S_{11}$| versus refinement level; curves: proposed approach (global polynomial), proposed approach (piecewise linear), ROM interpolation (piecewise linear).]

B. Mitered Microstrip Bend

Our second example, the mitered microstrip bend shown in Fig. 8, is a truly three-dimensional structure with more than one million FE unknowns. The model features two parameters, the operating frequency $f \in [1, 10]$ GHz and a geometric parameter $p \in [-0.7, 0.7]$ mm which controls the width $t$ of the mitered bend. We have:

$$t = 1.7062\ \text{mm} + p. \tag{25}$$

Again, the parametric ROM is based on $M = 5$ interpolation points placed at the locations of the zeros of the fifth-order Chebyshev polynomial of the first kind. The expansion frequency for the single-parameter ROMs is set to $f_{\text{exp}} = 5$ GHz.

Fig. 8. Structure of a mitered microstrip bend. Dimensions are in mm. Material properties of substrate: $\varepsilon_r = 2.2$, $\mu_r = 1$.

Fig. 9 shows the response surface of the magnitude of the reflection coefficient $S_{11}$, calculated by the proposed method. Fig. 10 presents $|S_{11}|$ and the corresponding error $|E_{S_{11}}|$ (22) with respect to conventional FE simulations versus frequency for the case $p = 0.2$ mm. The fact that the error is always more than 25 dB below the signal level underlines the high quality of the ROM.

Table II provides computational data for conventional FE simulation and the ROM. It can be seen that it takes more than 2 hours to build the parametric ROM. However, once the ROM is available, it can be evaluated more than 2200 times per second. For comparison, one conventional FE solution takes 180 s, which is more than 400 000 times longer!

Fig. 9. Mitered microstrip bend: Response surface of the magnitude of the reflection coefficient $|S_{11}|$ as a function of operating frequency $f$ and miter parameter $p$.

Fig. 10. Mitered microstrip bend: $|S_{11}|$ and error $|E_{S_{11}}|$ versus frequency. Parameter: $p = 0.2$ mm.

[Plot: $|S_{11}|$ (dB) versus frequency (GHz); curves: proposed approach, error of proposed approach.]
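The quoted speedups follow directly from the tabulated timings, as the short calculation below shows. The variable names are illustrative; the break-even count is a derived quantity that is not stated in the paper and ignores any other overhead.

```python
# Timings taken from Table II (microstrip bend) and Table I (dielectric post).
rom_evals_per_s = 2267.9      # ROM evaluations per second, microstrip bend
fe_time_s = 180.0             # one conventional FE solution, microstrip bend
rom_build_s = 7513.2          # offline ROM generation time, microstrip bend

# One FE solve versus one ROM evaluation.
speedup = fe_time_s * rom_evals_per_s

# Number of FE solves that cost as much as building the ROM once.
fe_solves_equivalent = rom_build_s / fe_time_s

# Dielectric post: ROM5 versus FE, both in evaluations per second.
post_speedup = 481.0 / 3.2
```

The ratio for the microstrip bend is about 4.08 · 10^5, and the offline cost corresponds to roughly 42 FE solves, so the ROM pays off as soon as more than a few dozen operating points are needed.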

VI. CONCLUSIONS 

This paper has presented a PMOR methodology for 

FE models with geometric parameters. It is characteristic 

of the new approach that geometry approximation 

is separated from the actual ROM generation process. 

Moreover, the suggested method incorporates state transformations 

that improve the quality of the interpolated 

projection matrices. In consequence, the present PMOR 

method reaches higher rates of convergence than previous 

approaches. Since the resulting parametric models are of 

small dimension, they are very fast to evaluate. 

TABLE II
COMPUTATIONAL DATA FOR MICROSTRIP BEND.

Model                               ROM            FE
Number of grid points               5              -
Moment-matching order               20             -
Dimension                           42             1,175,382
Model generation (s) *              7513.2         -
Evaluations per s *                 2267.9         5.56 · 10^-3
|Avr. error in S11| at p = 0.2 mm   1.06 · 10^-4   0

* MATLAB code on Intel Xeon E5620, one thread used.

REFERENCES 

[1] R. D. Slone, R. Lee, and J. F. Lee, “Broadband Model Order 

Reduction of Polynomial Matrix Equation using Single-Point 

Well-Conditioned Asymptotic Waveform Evaluation: Derivations

and Theory,” Int. J. Numer. Meth. Eng., vol. 58, pp. 2325 – 2342, 

Dec. 2003. 

[2] Y. Zhu, A. C. Cangellaris, “Finite Element-Based Model Order

Reduction of Electromagnetic Devices,” Int. J. Numer. Model., 

vol. 15, pp. 73 – 92, 2002. 

[3] R. D. Slone, J.-F. Lee, R. Lee, “Automating Multipoint Galerkin 

AWE for a FEM Fast Frequency Sweep,” IEEE Trans. Magn., 

vol. 38, no. 3, pp. 637 – 640, March 2002. 

[4] A. Schultschik, O. Farle, R. Dyczij-Edlinger, “An Adaptive Multi- 

Point Fast Frequency Sweep for Large-Scale Finite Element 

Models,” IEEE Trans. Magn., vol. 45, no. 3, pp. 1108 – 1111, 

March 2009 . 

[5] R. Dyczij-Edlinger and O. Farle, “Finite element analysis of 

linear boundary value problems with geometrical parameters,” 

COMPEL, vol. 28, no. 4, pp. 779 – 794, 2009. 

[6] O. Farle, S. Burgard, and R. Dyczij-Edlinger “Passivity Preserving 

Parametric Model-Order Reduction for Non-affine Parameters,” 

Math. Comp. Model. Dyn. Sys., vol. 17, no. 3, pp. 279 – 294, 

2011. 

[7] J.R. Phillips, “Variational interconnect analysis via PMTBR,” 

ICCAD, pp. 872 – 879, 7-11 Nov. 2004. 

[8] B. Lohmann, R. Eid, “Efficient Order Reduction of Parametric and 

Nonlinear Models by Superposition of Locally Reduced Models,” 

Methoden und Anwendungen der Regelungstechnik, pp. 27 – 36, 

Aachen:Shaker-Verlag, 2009. 

[9] H. Panzer, J. Mohring, R. Eid, and B. Lohmann, “Parametric 

Model Order Reduction by Matrix Interpolation,” at - Automatisierungstechnik, 

vol. 58, no. 8, pp. 475 – 484, 2010. 

[10] O. Farle and R. Dyczij-Edlinger, “Numerically Stable Moment 

Matching for Linear Systems Parameterized by Polynomials in 

Multiple Variables with Applications to Finite Element Models of 

Microwave Structures,” IEEE Trans. Antennas Propag., vol. 58, 

no. 11, pp. 3675 – 3684, Sep. 2010. 

[11] D. Amsallem, J. Cortial, K. Carlberg, C. Farhat, “A method 

for interpolating on manifolds structural dynamics reduced-order 

models,” Int. J. Numer. Meth. Eng., vol. 80, no. 9, pp. 1241 – 

1258, Nov. 2009. 

[12] S. Burgard, Morphing von Finite-Elemente-Netzen, Studienarbeit, 

Lehrstuhl für Theoretische Elektrotechnik, Universität des Saarlandes, 

2008. In German. 

[13] J. Pomplun and F. Schmidt, “Accelerated a Posteriori Error 

Estimation for the Reduced Basis Method with Application to 

3D Electromagnetic Scattering Problems,” SIAM J. Sci. Comput., 

vol. 32, no. 2, pp. 498 – 520, 2010. 

[14] G. H. Golub and C. F. Van Loan, Matrix Computations, Baltimore:Johns 

Hopkins University Press, pp. 69 – 75, 1996. 

[15] J. Rubio, J. Arroyo, and J. Zapata, “SFELP - An Efficient Methodology 

for Microwave Circuit Analysis,” IEEE Trans. Microw. 

Theory Techn., vol. 49, no. 3, pp. 509 – 516, Mar. 2001. 

[16] J. P. Webb, “Finite Element Analysis of H-plane Rectangular 

Waveguide Problems,” Microw. Antennas Propag., IEE Proceedings 

H, vol. 133, no. 2, pp. 91 – 94, April 1986. 



Efficient Finite-Element Computation of 

Far-Fields of Phased Arrays by Order Reduction 

A. Sommer*, O. Farle*, and R. Dyczij-Edlinger*

*Chair for Electromagnetic Theory, Saarland University, D-66123 Saarbrücken, Germany

E-mail: edlinger@lte.uni-saarland.de 

Abstract—This paper presents an efficient numerical method for computing the far-fields of phased antenna arrays over 

broad frequency bands as well as wide ranges of steering and look angles. The suggested approach combines finite-element

analysis, projection-based model-order reduction, and empirical interpolation. Numerical results demonstrate that 

evaluation times are reduced by orders of magnitude, compared to traditional methods. 

Index Terms—Empirical interpolation, far field computation, finite-element method, and model order reduction. 


I. INTRODUCTION

In many areas of application, such as radar or wireless

communications, phased antenna arrays need to be analyzed 

over broad frequency bands as well as wide ranges 

of steering and look angles. Finite-element (FE) based 

analysis of such structures involves two major steps: 

First, the near field is computed as a function of angular 

frequency ω and steering angles (θs,φs). Second, a 

discrete near-field-to-far-field (NF-FF) operator is applied 

to determine the far field as a function of frequency and 

look angles (θ,φ). Using conventional approaches, this 

procedure tends to be very time-consuming. The reasons 

are as follows: Since typical antenna arrays are electrically 

large and consist of high numbers of radiators, the 

corresponding FE systems are of very large dimension. 

Moreover, broadband analysis requires large numbers of 

sampling frequencies. At each of them, the large-scale 

FE system has to be solved, by matrix factorization or some

iterative method. In addition, wide variations in steering 

angles imply large numbers of sampling angles (θs,φs). 

Each of them leads to separate excitation and right-hand 

side (RHS), respectively. Finally, wide variations in look 

angles result in large numbers of sampling points (θ,φ). 

Each of them requires a separate NF-FF transformation 

at each operating point (θs,φs; ω) of the antenna array. 

To reduce computational efforts, we propose a two-step

approach: We first construct a reduced-order model 

(ROM) for the near fields which is very cheap to solve 

at any value of the parameter triple (θs,φs; ω). For 

this purpose, a multi-point model-order reduction (MOR) 

method with self-adaptive expansion point selection [1], 

[2] is applied. The second step utilizes the empirical 

interpolation (EI) method [3], [4] to construct an affine 

approximation to the NF-FF operator as a function of 

frequency and look angles. It, too, is very fast to evaluate 

at any value of the parameter triple (θ,φ; ω). Combining 

both steps yields a highly efficient numerical model 

for computing the far-fields as a function of the five 

parameters (θ,φ; θs,φs; ω), which is ideally suited for 

fast online evaluation: In Section VI, computing a far-field

pattern based on 4830 look angles takes only 2.4 s, 

on a personal computer executing plain MATLAB code. 

The construction of the model by MOR and EI is 

much more time-consuming and must be performed in 

advance, in an offline step. The most expensive procedure 

in the MOR algorithm is the FE analysis of the array at a 

number of expansion points (θs,φs; ω), which are chosen 

adaptively. Note that, as long as a direct solver is used, 

changes in the three parameters are not equally expensive: 

Since ω affects the FE matrix, each frequency value 

requires a new matrix factorization, which is computationally 

expensive. On the other hand, the steering angles 

(θs,φs) enter the RHS only. Thus, changes in angle just 

require additional forward-back substitutions, which are 

much cheaper. Therefore, the adaptive point placement 

strategy of the MOR method ought to vary ω as rarely 

as possible. The new frequency-slicing greedy method 

presented in Section IV-B implements this strategy in 

a systematic fashion. For the highest-accuracy ROM of 

Section VI, it improves computing time by a factor of 7 

at the cost of increasing ROM size by 5%, compared to 

state-of-the-art methods [1], [2]. 

The numerical experiments of Section VI indicate that 

both MOR and EI feature exponential convergence, in 

accordance with theoretical results [5]. Our results for 

a real-world example [6] demonstrate that the suggested 

two-step approach for the broadband analysis of radiation 

patterns of phased arrays achieves high accuracy and reduces 

evaluation times by orders of magnitude, compared 

to conventional approaches. 

II. FAR-FIELD COMPUTATION 

By the vector Huygens principle in the frequency 

domain [7], the radiation vector F of an arbitrary antenna 

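array, which is enclosed by a surface $S$, is given by

$$\mathbf{F}(\hat{\mathbf{r}},\omega) = \hat{\mathbf{r}} \times \int_{S} e^{\,j\frac{\omega}{c_0}\hat{\mathbf{r}}\cdot\mathbf{r}'}\,\mathbf{J}_s(\mathbf{r}',\omega)\,dS' \times \hat{\mathbf{r}} + \frac{1}{\eta_0}\int_{S} e^{\,j\frac{\omega}{c_0}\hat{\mathbf{r}}\cdot\mathbf{r}'}\,\mathbf{M}_s(\mathbf{r}',\omega)\,dS' \times \hat{\mathbf{r}}. \tag{1}$$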

Here c0 and η0 denote the vacuum speed of light and 

characteristic impedance, respectively, and ˆr is the unit 

vector in the direction of the observer. The equivalent 

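electric and magnetic surface current densities, $\mathbf{J}_s$ and $\mathbf{M}_s$, are given in terms of the electric and magnetic near-fields $\mathbf{E}$ and $\mathbf{H}$ by

$$\mathbf{J}_s(\mathbf{r}',\omega) = \hat{\mathbf{n}} \times \mathbf{H}(\mathbf{r}',\omega) \quad \text{with } \mathbf{r}' \in S, \tag{2a}$$

$$\mathbf{M}_s(\mathbf{r}',\omega) = -\hat{\mathbf{n}} \times \mathbf{E}(\mathbf{r}',\omega) \quad \text{with } \mathbf{r}' \in S, \tag{2b}$$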

wherein ˆn stands for the outward-pointing unit normal 

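vector on the Huygens surface $S$. The far-fields $\mathbf{E}_F$ and $\mathbf{H}_F$ are obtained from the radiation vector (1) by

$$\mathbf{E}_F(\mathbf{r},\omega) = -j\mu_0\omega\,\frac{e^{-j\frac{\omega}{c_0}r}}{4\pi r}\,\mathbf{F}(\hat{\mathbf{r}},\omega), \tag{3a}$$

$$\mathbf{H}_F(\mathbf{r},\omega) = -j\,\frac{\omega}{c_0}\,\frac{e^{-j\frac{\omega}{c_0}r}}{4\pi r}\,\hat{\mathbf{r}} \times \mathbf{F}(\hat{\mathbf{r}},\omega), \tag{3b}$$

and the directive gain [8] is given by

$$D(\hat{\mathbf{r}},\omega) = \frac{\mu_0\omega^2}{8\pi c_0}\,\frac{\|\mathbf{F}(\hat{\mathbf{r}},\omega)\|_2^2}{P(\omega)}, \tag{4}$$

wherein $\mu_0$ describes the vacuum permeability. In (4), the total radiated power $P(\omega)$ is determined from

$$P(\omega) = \frac{1}{2}\,\Re\left\{\int_{S} \hat{\mathbf{n}} \times \mathbf{E}(\mathbf{r}',\omega)\cdot\mathbf{H}^*(\mathbf{r}',\omega)\,dS'\right\}. \tag{5}$$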

III. FE MODEL AND MULTI-POINT MOR METHOD 

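FE analysis of the near fields of a phased array of $L$ antennas leads to a linear system of the form

$$(\mathbf{A}_0 + \omega\mathbf{A}_1 + \omega^2\mathbf{A}_2)\,\mathbf{x}(\mathbf{p}) = \omega\sum_{p=1}^{L} u_p(\mathbf{p})\,\mathbf{b}_p, \tag{6a}$$

$$\mathbf{y}(\mathbf{p}) = (\mathbf{C}_0 + \omega^{-1}\mathbf{C}_1)\,\mathbf{x}(\mathbf{p}), \tag{6b}$$

$$P(\mathbf{p}) = \omega^{-1}\,\mathbf{x}^*(\mathbf{p})\,\mathbf{D}\,\mathbf{x}(\mathbf{p}). \tag{6c}$$

Herein, $\mathbf{A}_0, \mathbf{A}_1, \mathbf{A}_2 \in \mathbb{C}^{N\times N}$ are the stiffness, damping, and mass matrices, respectively, $\mathbf{x}$ denotes the solution vector in terms of $\mathbf{E}$, $\mathbf{p} = (\theta_s,\phi_s,\omega) \in \mathbb{R}^3$ the parameter vector, and $N$ the dimension of the FE system. The output vector $\mathbf{y} \in \mathbb{C}^{6H}$ holds the electric and the magnetic near-field values $\mathbf{E}$ and $\mathbf{H}$, respectively, sampled at $H$ points on the Huygens surface $S$. In (6b), the matrices $\mathbf{C}_0 \in \mathbb{C}^{6H\times N}$ and $\mathbf{C}_1 \in \mathbb{C}^{6H\times N}$ carry out the sampling process and magnetic field computation on $S$. Furthermore, the Hermitian matrix $\mathbf{D} \in \mathbb{C}^{N\times N}$ of the bilinear form (6c) represents the computation of the total radiated power (5). It can be seen that the system matrix of (6a) depends on the angular frequency $\omega$ only, while the RHS also depends on the steering angles $\theta_s$ and $\phi_s$. Note that the RHS is constructed by a superposition of $L$ linearly independent vectors $\mathbf{b}_p \in \mathbb{C}^N$ with parameter-dependent weights $u_p(\mathbf{p})$.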

To obtain the near-fields vector y and the total radiated 

power P , the large-scale system (6a) has to be solved for 

each parameter vector p of interest. Our goal is to bypass 

this time-consuming procedure. Since the FE system (6) 

exhibits affine parameter dependence [1], it is well-suited 

for projection-based MOR. The idea is to approximate 

the FE solution x (p) in a low dimensional subspace 

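according to

\[ x(\mathbf{p}) \approx \hat{x}(\mathbf{p}) = V \tilde{x}(\mathbf{p}) \tag{7} \]

with $\hat{x} \in \mathbb{C}^N$, $\tilde{x} \in \mathbb{C}^n$, $V \in \mathbb{C}^{N \times n}$, and $n \ll N$ for all $\mathbf{p} \in \mathcal{D}$. Here, $\mathcal{D}$ denotes the considered parameter domain. For numerical stability, the columns of the trial matrix $V$ are chosen to be orthogonal. Substituting the approximation (7) for $x(\mathbf{p})$ in (6a) and testing with $V^*$ leads to the ROM:

\[ \sum_{q=0}^{2} \omega^q \tilde{A}_q \, \tilde{x}(\mathbf{p}) = \omega \sum_{p=1}^{L} u_p(\mathbf{p}) \, \tilde{b}_p, \tag{8a} \]

\[ \hat{y}(\mathbf{p}) = \sum_{r=0}^{1} \omega^{-r} \tilde{C}_r \, \tilde{x}(\mathbf{p}), \tag{8b} \]

\[ \hat{P}(\mathbf{p}) = \omega^{-1} \, \tilde{x}^*(\mathbf{p}) \, \tilde{D} \, \tilde{x}(\mathbf{p}), \tag{8c} \]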

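wherein the reduced matrices and vectors are given by

\[ \tilde{A}_q = V^* A_q V \quad \text{with } \tilde{A}_q \in \mathbb{C}^{n \times n}, \tag{9} \]

\[ \tilde{b}_p = V^* b_p \quad \text{with } \tilde{b}_p \in \mathbb{C}^{n}, \tag{10} \]

\[ \tilde{C}_r = C_r V \quad \text{with } \tilde{C}_r \in \mathbb{C}^{6H \times n}, \tag{11} \]

\[ \tilde{D} = V^* D V \quad \text{with } \tilde{D} \in \mathbb{C}^{n \times n}. \tag{12} \]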

Using a multi-point (MP) MOR method, the trial matrix 

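$V$ is constructed from FE solutions on a discrete set $\mathcal{D}_e \subset \mathcal{D}$ of expansion points $\mathbf{p}_i \in \mathcal{D}_e$ such that

\[ \operatorname{range} V = \operatorname{span} \{ x(\mathbf{p}_1), \ldots, x(\mathbf{p}_n) \}. \tag{13} \]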

As long as n ≪ N, the ROM (8) can be solved much 

more efficiently than the original system (6). Thus, the 

computational costs for determining both the near-field 

values and the total radiated power can be kept very low. 
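The projection step (7) to (13) can be illustrated on a small dense stand-in for the FE system. The sketch below is not the authors' MATLAB implementation: the matrices produced by the hypothetical helper `toy_system`, the single excitation vector (L = 1 with unit weight), and the chosen expansion frequencies are assumptions made purely for illustration.

```python
import numpy as np

def toy_system(N, seed=0):
    """Hypothetical stand-in for (A0, A1, A2) and one excitation vector b."""
    rng = np.random.default_rng(seed)
    Qmat, _ = np.linalg.qr(rng.standard_normal((N, N)))
    A0 = (Qmat @ np.diag(np.linspace(1.0, 10.0, N)) @ Qmat.T).astype(complex)
    A1 = 0.5j * np.eye(N)             # damping-like term, keeps A(w) regular
    A2 = -np.eye(N, dtype=complex)    # mass-like term
    b = rng.standard_normal(N).astype(complex)
    return (A0, A1, A2), b

def solve_full(A, b, w):
    """Solve the full-order system (6a) for one frequency w."""
    return np.linalg.solve(A[0] + w * A[1] + w**2 * A[2], w * b)

def trial_matrix(A, b, expansion_freqs):
    """Orthonormal basis of the span of the snapshots, cf. (13)."""
    snapshots = np.column_stack([solve_full(A, b, w) for w in expansion_freqs])
    V, _ = np.linalg.qr(snapshots)    # reduced QR: V has one column per snapshot
    return V

def solve_rom(A, b, V, w):
    """Galerkin projection (8a), (9), (10) followed by the lift (7)."""
    Ar = [V.conj().T @ Aq @ V for Aq in A]
    br = V.conj().T @ b
    x_tilde = np.linalg.solve(Ar[0] + w * Ar[1] + w**2 * Ar[2], w * br)
    return V @ x_tilde

A, b = toy_system(N=60)
V = trial_matrix(A, b, expansion_freqs=[1.0, 1.25, 1.5, 1.75, 2.0])
for w in (1.5, 1.6):                  # an expansion point and an intermediate point
    x, x_hat = solve_full(A, b, w), solve_rom(A, b, V, w)
    print(w, np.linalg.norm(x - x_hat) / np.linalg.norm(x))
```

At an expansion point the ROM reproduces the full solution up to round-off, because that solution lies in the range of V. Between expansion points the error depends on how well the snapshots cover the parameter domain, which is what the point-selection strategy of Section IV addresses.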

IV. SELF-ADAPTIVE EXPANSION POINT SELECTION 

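The residual $r$ of $\hat{x}$ with respect to (6a) takes the form

\[ r(\mathbf{p}) = \sum_{q=0}^{2} \omega^q \left( A_q V \right) \tilde{x}(\mathbf{p}) - \omega \sum_{p=1}^{L} u_p(\mathbf{p}) \, b_p. \tag{14} \]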

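Thus, the computation of its 2-norm,

\[
\begin{aligned}
\| r(\mathbf{p}) \|_2^2 = {} & \sum_{q_1=0}^{2} \sum_{q_2=0}^{2} \omega^{q_1+q_2} \, \tilde{x}^*(\mathbf{p}) \, V^* A_{q_1}^* A_{q_2} V \, \tilde{x}(\mathbf{p}) \\
& - 2 \Re \left\{ \sum_{p=1}^{L} \sum_{q=0}^{2} \omega^{q+1} \, \overline{u_p(\mathbf{p})} \, b_p^* A_q V \, \tilde{x}(\mathbf{p}) \right\} \\
& + \omega^2 \sum_{p_1=1}^{L} \sum_{p_2=1}^{L} \overline{u_{p_1}(\mathbf{p})} \, u_{p_2}(\mathbf{p}) \, b_{p_1}^* b_{p_2},
\end{aligned}
\tag{15}
\]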

just involves matrices and vectors of the ROM dimension 

n ≪ N and is therefore very fast. This motivates the use 

of a residual-based error indicator $\rho(\mathcal{D}_{ds})$ in the point-placement strategy:

\[ \rho(\mathcal{D}_{ds}) = \max_{\mathbf{p} \in \mathcal{D}_{ds}} \| r(\mathbf{p}) \|_2. \tag{16} \]

Herein, Dds stands for a dense sampling of the considered 

domain.
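The offline/online split behind (15) can be sketched as follows. All parameter-independent products are formed once, after which each residual norm costs only operations in the ROM dimension n and the number of excitations L. The function names and the random test data in the usage note are hypothetical and only serve to compare the fast evaluation against a direct evaluation of (14).

```python
import numpy as np

def precompute(A, B, V):
    """Offline part: parameter-independent blocks appearing in (15).

    A : list [A0, A1, A2] of N x N matrices
    B : N x L matrix whose columns are the excitation vectors b_p
    V : N x n trial matrix
    """
    AV = [Aq @ V for Aq in A]                                   # N x n
    G = [[AVi.conj().T @ AVj for AVj in AV] for AVi in AV]      # n x n blocks
    K = [B.conj().T @ AVq for AVq in AV]                        # L x n blocks
    BB = B.conj().T @ B                                         # L x L block
    return G, K, BB

def residual_norm_fast(G, K, BB, w, u, x_tilde):
    """Online part: evaluate (15) with small matrices only (w is real)."""
    quad = sum(w**(i + j) * (x_tilde.conj() @ G[i][j] @ x_tilde)
               for i in range(3) for j in range(3))
    cross = sum(w**(q + 1) * (u.conj() @ K[q] @ x_tilde) for q in range(3))
    rhs = w**2 * (u.conj() @ BB @ u)
    val = quad.real - 2.0 * cross.real + rhs.real
    return np.sqrt(max(val, 0.0))       # clip tiny negative round-off

def residual_norm_direct(A, B, V, w, u, x_tilde):
    """Reference: build the residual (14) in full dimension N."""
    r = sum(w**q * (A[q] @ (V @ x_tilde)) for q in range(3)) - w * (B @ u)
    return np.linalg.norm(r)
```

Because the fast formula subtracts quantities of similar size, its result is limited by cancellation to roughly the square root of machine precision relative to the individual terms. This is a general property of such expansions and should be kept in mind when very small thresholds are used.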

Algorithm 1 Conventional Greedy Algorithm.

Given: Dds, p1 ∈ Dds, and ɛ.

n = 0. {Initialize ROM dimension.}

repeat

    n ← n + 1.

    ωc = ω(pn).

    Compute LU factorization of A(ωc).

    Determine x(pn) by forward-back substitution.

    Construct ROM by (8a).

    Compute residual norm ‖r(p)‖₂ for all p ∈ Dds.

    Place expansion point pn+1 using (17).

until ρn(Dds)
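The loop structure of Algorithm 1 can be sketched on a toy problem. Several points are assumptions, since they are not visible in this excerpt: the stopping test is taken to be "indicator below ɛ", the next expansion point is taken to be the training point with the largest residual norm, the parameter is the frequency only, and the hypothetical helper `toy_system` replaces the FE matrices. The residual is evaluated directly in full dimension for brevity, not through (15), and `np.linalg.solve` stands in for the explicit LU factorization with forward-back substitution.

```python
import numpy as np

def toy_system(N, seed=0):
    """Hypothetical stand-in for (A0, A1, A2) and one excitation vector b."""
    rng = np.random.default_rng(seed)
    Qmat, _ = np.linalg.qr(rng.standard_normal((N, N)))
    A0 = (Qmat @ np.diag(np.linspace(1.0, 10.0, N)) @ Qmat.T).astype(complex)
    return (A0, 0.5j * np.eye(N), -np.eye(N, dtype=complex)), \
        rng.standard_normal(N).astype(complex)

def assemble(A, w):
    """A(w) = A0 + w*A1 + w^2*A2 for full or reduced matrices."""
    return A[0] + w * A[1] + w**2 * A[2]

def greedy_rom(A, b, train, eps, n_max):
    """Greedy multi-point ROM construction over the training frequencies."""
    snapshots, w_next = [], train[0]
    for n in range(1, n_max + 1):
        # Full-order solve at the current expansion point.
        snapshots.append(np.linalg.solve(assemble(A, w_next), w_next * b))
        V, _ = np.linalg.qr(np.column_stack(snapshots))
        Ar = [V.conj().T @ Aq @ V for Aq in A]
        br = V.conj().T @ b
        # Residual norm of the ROM solution at every training point.
        res = np.empty(len(train))
        for k, w in enumerate(train):
            x_tilde = np.linalg.solve(assemble(Ar, w), w * br)
            res[k] = np.linalg.norm(assemble(A, w) @ (V @ x_tilde) - w * b)
        rho = res.max()
        if rho < eps:                       # assumed stopping criterion
            break
        w_next = train[int(res.argmax())]   # assumed placement rule
    return V, rho, n

A, b = toy_system(N=30)
V, rho, n = greedy_rom(A, b, np.linspace(1.0, 2.0, 41), eps=1e-6, n_max=30)
print(n, rho)
```

Every pass through the loop requires one full-order solve at a new frequency, which mirrors the observation in the text that the conventional method must factorize the FE matrix at each iteration.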

Let P denote the number of far-field look angles for 

which (23) is to be evaluated. It can be seen that, although 

the solution ˜x (p) of the ROM (8) is used, the computational 

effort for merely one single operating-point p ∈ D 

is still of complexity O (PH + Hn), i.e., the far-field 

computation itself is expensive, too. Our solution to this 

problem is to adopt an idea from [2] and employ the 

EI method [3], [4] to construct an affine decomposition 

of the exponential function (22). The offline part of this 

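method uses a greedy strategy to determine a set of $M$ basis functions $\{ q_m \}_{m=1}^{M}$, interpolation points $\{ \mathbf{r}'_m \}_{m=1}^{M}$ and parameter values $\{ \mathbf{d}'_m \}_{m=1}^{M}$ such that the interpolant $\hat{e}(\mathbf{d}, \mathbf{r}')$ defined by

\[ \hat{e}(\mathbf{d}, \mathbf{r}') = \sum_{m=1}^{M} \alpha_m(\mathbf{d}) \, q_m(\mathbf{r}') \tag{24} \]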

approximates (22) for all (r ′ , d) ∈ S ×M. Having 

constructed the interpolation matrix

\[ B_M = \begin{bmatrix} q_1(\mathbf{r}'_1) & & \\ \vdots & \ddots & \\ q_1(\mathbf{r}'_M) & \cdots & q_M(\mathbf{r}'_M) \end{bmatrix} \tag{25} \]

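offline, the parameter-dependent coefficients $\{ \alpha_m(\mathbf{d}) \}_{m=1}^{M}$ are obtained online, by solving the lower triangular system

\[ B_M \begin{bmatrix} \alpha_1(\mathbf{d}) \\ \vdots \\ \alpha_M(\mathbf{d}) \end{bmatrix} = \begin{bmatrix} e(\mathbf{d}, \mathbf{r}'_1) \\ \vdots \\ e(\mathbf{d}, \mathbf{r}'_M) \end{bmatrix}. \tag{26} \]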

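Substituting the empirical interpolant (24) for $e(\mathbf{d}, \mathbf{r}')$ in (23) results in

\[ I_x\left( \tilde{x}(\mathbf{p}), \hat{e}(\mathbf{d}, \mathbf{r}'), \omega \right) = \omega^{-1} \Delta S \sum_{m=1}^{M} \alpha_m(\mathbf{d}) \sum_{h=1}^{H} q_m(\mathbf{r}'_h) \, \hat{\mathbf{n}}(\mathbf{r}'_h) \times \tilde{C}_1(\mathbf{r}'_h) \, \tilde{x}(\mathbf{p}). \tag{27} \]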

Under the precondition that the sampling points on the 

Huygens surface S remain constant for all M steps of the 

EI method, the online part of (24) can be implemented 

such that it takes only O (M) operations. Thus, the 

computational efforts for computing P far-field values 

by (27) for a given operating-point p ∈ D are only of 

order O (PM + Mn). Since, in practice, M ≪ H, the 

costs of the far-field computation are greatly reduced. 
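The greedy offline stage and the triangular online solve of the EI method can be sketched as follows. The kernel used here, a plane-wave phase factor sampled on a unit circle, is only an assumed stand-in for the exponential function (22), which is not reproduced in this excerpt. The function names, the training grid and the value M = 15 are likewise hypothetical.

```python
import numpy as np

def kernel(k, theta, phi):
    """Assumed stand-in for (22): phase factor exp(j*k*cos(theta - phi))
    for wavenumber k, look angle theta and sample-point angles phi."""
    return np.exp(1j * k * np.cos(theta - phi))

def eim_offline(G, M):
    """Greedy empirical interpolation on a snapshot matrix G
    (rows: sample points, columns: training parameters)."""
    Q = np.zeros((G.shape[0], M), dtype=complex)
    idx, R = [], G.copy()
    for m in range(M):
        k = int(np.argmax(np.abs(R).max(axis=0)))   # worst training parameter
        i = int(np.argmax(np.abs(R[:, k])))         # point of largest error
        Q[:, m] = R[:, k] / R[i, k]                 # new basis function, 1 at point i
        idx.append(i)
        B = Q[idx, :m + 1]                          # interpolation matrix, cf. (25)
        R = G - Q[:, :m + 1] @ np.linalg.solve(B, G[idx, :])
    return Q, np.array(idx), B

def eim_online(B, e_at_points):
    """Coefficients alpha_m(d) from the small system (26)."""
    return np.linalg.solve(B, e_at_points)

phi = np.linspace(0.0, 2.0 * np.pi, 120, endpoint=False)
G = np.column_stack([kernel(k, th, phi)
                     for k in np.linspace(1.0, 2.0, 10)
                     for th in np.linspace(0.0, 2.0 * np.pi, 40, endpoint=False)])
Q, idx, B = eim_offline(G, M=15)
e_true = kernel(1.37, 0.9, phi)                     # parameter outside the training grid
e_hat = Q @ eim_online(B, e_true[idx])
print(np.max(np.abs(e_hat - e_true)))
```

Only the M kernel values at the interpolation points are needed online. Each new basis function vanishes at the previously selected points, which is what makes the interpolation matrix lower triangular with unit diagonal. A general dense solve is used here for simplicity; a dedicated triangular solver could exploit that structure.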

VI. NUMERICAL RESULTS 

In the following, we consider the FE model of a 

dual-polarized tapered slot antenna array (TSAA) [6] 

consisting of L = 40 antennas, whose geometry is 

depicted in Fig. 1. The frequency band is given by 

f ∈ [2, 4] GHz, and the scan angles of interest are in the range of (θs, φs) ∈ [0, π/3] × [0, 2π) rad². We use

#Dds = 17040 training points in the offline part of 

the self-adaptive multi-point method of Section IV and 

construct the ROM (8) by both the conventional greedy 

algorithm and the new FSG approach of Section IV-B. 


Fig. 1. Geometry of the TSAA [6]. Dimensions: length l = 8 cm, width w = 8 cm, height h = 7 cm, and displacement of adjacent antennas s = 2 cm.

Fig. 2. Normalized error indicator (16) versus ROM dimension n for the conventional greedy method and the new FSG method (vertical axis: maximum norm of the local error indicator, logarithmic scale). Circles mark changes in expansion-point frequency in the FSG method, requiring matrix factorization.

A. Properties of FSG method 

Fig. 2 presents the behavior of the normalized error 

indicator (16) as a function of ROM dimension n. It 

can be seen that the standard method achieves nearly 

constant rates of convergence, whereas the FSG approach 

converges rather slowly during the early stages of the 

iteration. This behavior is expected because, early on, the 

frequency sampling of the FSG method is very poor. On 

the other hand, the standard procedure must factorize the 

FE matrix at each iteration, whereas the FSG method requires 

factorizations only when the expansion frequency 

changes, i.e., at the iterations marked by circles in Fig. 2. 

Thus, to compare overall computational efficiency, we 

have measured computing times for the same threshold 

ɛ of the error indicator. Table I presents the results. It 

can be seen that, depending on the threshold level, the 

proposed FSG method is 5 to 7.5 times faster.

TABLE I

TSAA: COMPUTING TIME FOR ROM CONSTRUCTION (8)

Residual       Time t*                 Speed-up    Dimension n
threshold ɛ    Alg. 1      Alg. 2      factor      Alg. 1    Alg. 2
2.7e−3         92.16 h     18.34 h     5.025       153       285
4.6e−5         164.98 h    24.25 h     6.803       265       333
1.3e−6         221.32 h    29.46 h     7.513       346       363

* MATLAB code on Intel(R) Xeon(R) E5620 CPU at 2.40 GHz.

Fig. 3. Relative error in near-fields (28) versus ROM dimension n for the conventional approach and the FSG method (vertical axis: relative error e_n, logarithmic scale). Parameter: p = (π/4 rad, −π/6 rad, 3.645 GHz) ∉ Dds.

B. Error in near-fields 

Our measure for the error in the near-fields at a given parameter vector $\mathbf{p}$ is the relative error $e_n(\mathbf{p})$ defined by

\[ e_n(\mathbf{p}) = \frac{\| x(\mathbf{p}) - \hat{x}_n(\mathbf{p}) \|_2}{\| x(\mathbf{p}) \|_2}. \tag{28} \]

To investigate the convergence behavior of the projection-based MOR method, we choose a representative parameter vector, $\mathbf{p} = (\pi/4 \text{ rad}, -\pi/6 \text{ rad}, 3.645 \text{ GHz}) \notin \mathcal{D}_{ds}$, and evaluate (28) as a function of ROM dimension

n. Fig. 3 shows that both the conventional approach 

and the FSG method exhibit exponential convergence. 

C. Error in far-fields 

The following tests are based on #Mds = 28380 

training points in the offline part of the EI method. The 

considered look angles are in the range of (θ, φ) ∈ [0, π/2] × [0, 2π) rad².

We first investigate the error of the empirical interpolant 

êm (d, r ′ ) of (24) with respect to the true value 

of the exponential function e (d, r ′ ) of (22). For this 

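purpose, we choose a representative parameter vector, $\mathbf{d} = (3.789 \text{ GHz}, -\pi/4 \text{ rad}, \pi/5 \text{ rad}) \notin \mathcal{M}_{ds}$, and monitor the relative error $e_m(\mathbf{d})$,

\[ e_m(\mathbf{d}) = \max_{\mathbf{r}' \in S_h} \left| \frac{e(\mathbf{d}, \mathbf{r}') - \hat{e}_m(\mathbf{d}, \mathbf{r}')}{e(\mathbf{d}, \mathbf{r}')} \right|, \tag{29} \]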

as a function of the number of EI coefficients m. The 

results shown in Fig. 4 demonstrate that the EI method 

leads to exponential convergence. 


Fig. 4. Relative error in exponential function (29) versus number of EI coefficients m (vertical axis: relative error e_m, logarithmic scale). Parameter: d = (3.789 GHz, −π/4 rad, π/5 rad).

TABLE II

AVERAGE ERROR IN DIRECTIVE GAIN.

Steering angles          Average error eD (30)
(θs, φs)                 2.57 GHz       3.30 GHz       3.95 GHz
(π/4 rad, π/2 rad)       1.2 × 10^−3    1.7 × 10^−3    2.6 × 10^−3
(π/4 rad, π/4 rad)       1.1 × 10^−3    1.8 × 10^−3    2.8 × 10^−3
(π/6 rad, 0 rad)         1.3 × 10^−3    1.9 × 10^−3    3.3 × 10^−3
(π/3 rad, −π/3 rad)      2.2 × 10^−3    2.0 × 10^−3    3.1 × 10^−3

In our final test, we consider the error in radiation pattern of the phased antenna array, by measuring the average error in directive gain $e_D$,

\[ e_D(\mathbf{p}) = \frac{1}{P} \sum_{p=1}^{P} \left| \frac{D(\mathbf{p}, \mathbf{d}_p) - \hat{D}(\mathbf{p}, \mathbf{d}_p)}{D(\mathbf{p}, \mathbf{d}_p)} \right|. \tag{30} \]

Table II presents error values for 12 different parameter 

vectors p /∈ Dds, corresponding to the far-field plots in 

Fig. 5 – Fig. 7. It can be seen that the results of the 

suggested MOR approach are in very good agreement 

with reference data. Computational parameters for this 

test can be found in Table III and Table IV. 

D. Overall runtime performance 

Fig. 5 – Fig. 7 show three-dimensional radiation patterns 

of the TSAA for different operating frequencies 

and four steering angles per frequency. Computational 

data of the original FE model and the ROM are given in 

Table III and Table IV, respectively. Without doubt, the 

offline part of the algorithm leads to some one-time costs 

for constructing the ROM and the affine approximation 

to the NF-FF operator. However, once they are available, 

computing time for the near-fields improves by a factor of 

68000, compared to conventional FE analysis. Moreover, 

post-processing time for one radiation pattern based on 

P = 4830 look angles reduces by a factor of 12. Thus,

the total speed-up factor for computing one near-field 

solution plus the corresponding far-field pattern is 910. 
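The quoted speed-up factors follow directly from the timings listed in Table III and Table IV. The short check below reproduces them; the break-even estimate at the end is an additional back-of-the-envelope figure derived here from the tabulated offline times, under the assumption that every conventional analysis costs the same as the one reported, and it is not a number stated by the authors.

```python
def speedups(t_fe_solve, t_fe_pattern, t_rom_solve, t_rom_pattern):
    """Near-field, far-field and combined speed-up factors (times in seconds)."""
    near = t_fe_solve / t_rom_solve
    far = t_fe_pattern / t_rom_pattern
    total = (t_fe_solve + t_fe_pattern) / (t_rom_solve + t_rom_pattern)
    return near, far, total

def break_even(t_offline_h, t_fe_total, t_rom_total):
    """Number of operating points after which the offline cost is amortized."""
    return t_offline_h * 3600.0 / (t_fe_total - t_rom_total)

# Timings from Table III (FE model) and Table IV (ROM).
near, far, total = speedups(2192.4, 28.9, 0.0321, 2.4087)
points = break_even(20.36 + 33.97, 2192.4 + 28.9, 0.0321 + 2.4087)

print(f"near-field speed-up:  {near:.0f}")
print(f"far-field speed-up:   {far:.1f}")
print(f"combined speed-up:    {total:.0f}")
print(f"break-even evaluations (derived estimate): {points:.0f}")
```

The combined factor is dominated by the far-field post-processing, since the online ROM solve takes only a small fraction of the remaining 2.4 s.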

VII. CONCLUSIONS 

An efficient two-step MOR method for computing the far-field patterns of phased antenna arrays has been presented. Thanks to the new FSG technique, evaluation times for a real-world example [6] improve by a factor of 5 to 7.5 over earlier MOR approaches, and by a factor of 910 compared to conventional FE analysis.

Fig. 5. Radiation patterns of the TSAA at f = 2.57 GHz, determined by the two-step MOR method using P = 4830 look angles. (a) θs = π/4 rad, φs = π/2 rad. (b) θs = π/4 rad, φs = π/4 rad. (c) θs = π/6 rad, φs = 0 rad. (d) θs = π/3 rad, φs = −π/3 rad.

Radiation pattern panels: (a) θs = π/4 rad, φs = π/2 rad. (b) θs = π/4 rad, φs = π/4 rad.

REFERENCES 

[1] V. de la Rubia, U. Razafison, and Y. Maday, "Reliable fast frequency sweep for microwave devices via the reduced-basis method", IEEE Trans. Microw. Theory Techn., vol. 57, pp. 2923-2937, Dec. 2009.

[2] M. Fares, J. S. Hesthaven, Y. Maday, and B. Stamm, "The reduced basis method for the electric field integral equation", J. Comput. Phys., vol. 230, pp. 5532-5555, 2011.


Radiation pattern panels: (a) θs = π/4 rad, φs = π/2 rad. (b) θs = π/4 rad, φs = π/4 rad.

TABLE III

COMPUTATIONAL DATA OF ORIGINAL FE MODEL OF TSAA.

Parameters: θs = π/4 rad, φs = π/2 rad, f = 2.57 GHz.

FE dimension N                            2,553,439
Number of near-field points H             12,800
Number of look angles P                   4,830
Time for solving FE system (6a)           2192.4 s*
Time for computing radiation pattern      28.9 s*

* MATLAB code on Intel(R) Xeon(R) E5620 CPU at 2.40 GHz.

TABLE IV

COMPUTATIONAL DATA OF REDUCED-ORDER MODEL OF TSAA.

Parameters: θs = π/4 rad, φs = π/2 rad, f = 2.57 GHz.

ROM dimension n                           300
Number of EI coefficients m               350
Number of look angles P                   4,830
Offline time for generating ROM (8)       20.36 h*
Offline time for EI method                33.97 h*
Online time for solving ROM (8a)          0.0321 s*
Online time for radiation pattern         2.4087 s*

* MATLAB code on Intel(R) Xeon(R) E5620 CPU at 2.40 GHz.

[3] M. Barrault, Y. Maday, N. C. Nguyen, and A. T. Patera, "An 'empirical interpolation' method: application to efficient reduced-basis discretisation of partial differential equations", C. R. Acad. Sci. Paris, Ser. I 339, pp. 667-672, 2004.

[4] M. A. Grepl, Y. Maday, N. C. Nguyen, and A. T. Patera, "Efficient reduced basis treatment of nonaffine and nonlinear partial differential equations", M2AN Math. Model. Numer. Anal. 41, pp. 575-605, 2007.

[5] P. Binev, A. Cohen, W. Dahmen, R. DeVore, G. Petrova, and P. Wojtaszczyk, "Convergence rates for greedy algorithms in reduced basis methods", SIAM J. Math. Anal., vol. 43, pp. 1457-1472, 2011.

[6] T.-H. Chio and D. H. Schaubert, "Parameter study and design of wide-band widescan dual-polarized tapered slot antenna arrays", IEEE Trans. Antennas Propag., vol. 48, pp. 879-886, June 2000.

[7] E. J. Rothwell and M. J. Cloud, "Electromagnetics", CRC Press, 2009.

[8] S. J. Orfanidis, "Electromagnetic Waves and Antennas", http://www.ece.rutgers.edu/~orfanidi/ewa.


Nanoparticle device for biomedical and 

optoelectronics applications 

R. Iovine, L. La Spada and L. Vegni 

Department of Applied Electronics, University of Roma Tre, Via della Vasca Navale 84, 00146 Rome, Italy 

E-mail: riovine@uniroma3.it 

Abstract—In this contribution a nanoparticle device, operating in the visible regime based on the Localized Surface Plasmon 

Resonance (LSPR) phenomenon, is presented. The nanoparticle electromagnetic properties are evaluated by a new analytical 

model and compared to the results obtained by numerical analysis. A near-field enhancement is obtained by arranging the 

nanoparticles in a linear array. Analytical formulas, describing such enhancement, are presented. The structure can find 

application for medical diagnostics and optoelectronics applications. 

Index Terms— LSPR, Medical diagnostics, Nanoparticle, Near-field Enhancement, Optoelectronics Applications 


In the last few years, several research groups have paid attention to the optical properties of gold nanoparticles, which are related to the interaction of these structures with the electromagnetic field in the Visible (VIS) and Near Infrared (NIR) regions [1].

When the electromagnetic field interacts with small metal particles, the conduction electrons start oscillating collectively. This phenomenon is well known and is called Localized Surface Plasmon (LSP) [2]. If the frequency of the incident field matches the natural oscillation frequency of the electron cloud, the resonance condition is established, with a strong dependence on the shape, size and composition of the nanoparticles as well as on the dielectric properties of the background environment [3].

An analytical closed-form solution for the electromagnetic behavior of metal nanoparticles exists only for spherical shapes [4]. The ability to predict the electromagnetic properties of other shapes is now very important, because progress in nanofabrication technology makes it possible to realize many particle shapes [5] suitable for several application fields such as biomedical sensing [6] and thin film solar cells [7].

For example, [8] reports the possibility of controlling the enhancement of Surface Enhanced Raman Scattering (SERS) using gold nanoparticles in the field of diagnostic oncology. In [9], the possibility of using gold nanoparticles to convert absorbed light energy efficiently into heat, which may be employed for selective PhotoThermal Therapy (PTT), is discussed.

The aim of this contribution is to propose the design of a nanostructure device consisting of a linear chain array of gold nanocubes deposited on a silica substrate.

For the cube particle, a new analytical quasi-static model describing its resonant behavior in terms of absorption and scattering cross-sections is presented. The results obtained by the analytical model are compared to those obtained through full-wave simulations [10] and by the boundary integral method approach [11].

The electromagnetic behavior of the device is evaluated for different inter-particle distances. In particular, the far-field properties and the near electric field distribution are obtained numerically, and the performance of the structure is analyzed for possible optoelectronics applications (design of absorbing layers) and for biosensing applications (refractive index measurements).

II. QUASI STATIC ANALYTICAL MODEL FOR THE 

CUBE PARTICLE 

In general, the nanoparticles are small compared to the wavelength (e.g. at optical frequencies), so it is possible to assume that all the conduction electrons in a nanoparticle see the same field at a given time (quasi-static approximation).

Figure 1: Scheme of interaction between electromagnetic field and 

small particles compared to wavelength. 

The displacement of the electrons by the incident electromagnetic field induces a dipolar charge separation (positive nuclei - free electrons), generating a restoring force which opposes the incident field. The electron position is determined by the following equation:

 

 

(1) 

 

where is the electron mass, is the electron damping 

coefficient and is the restoring coefficient. 

The relation (1) is a second order inhomogeneous 

differential equation with the following solution for 

harmonic excitation: 

 

 

(2) 

where 

is the natural frequency of the system.

This model is equivalent to a classical mechanical 

oscillator and represents a good physical interpretation to 

understand the Localized Surface Plasmon Resonance 

(LSPR) phenomenon. The resonance condition is established when the frequency of the excitation matches the natural frequency, so that the denominator of (2) tends to zero. The damping and restoring coefficients are very difficult to evaluate and are implicitly related to the geometry and electromagnetic properties of the particles and to the permittivity of the dielectric environment.
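For orientation, a driven and damped harmonic oscillator of the kind described by (1) and (2) is commonly written as shown in the following sketch. The symbols $m_e$ (electron mass), $\gamma$ (damping coefficient), $K$ (restoring coefficient), $e$ (elementary charge), $E_0$ (field amplitude) and the $e^{j \omega t}$ time convention are assumptions chosen here and need not coincide with the notation used in (1) and (2):

```latex
\begin{align*}
m_e \frac{d^2 x}{dt^2} + \gamma \frac{dx}{dt} + K x &= -e E_0 \, e^{j \omega t} , \\
x(t) &= -\frac{e}{m_e} \, \frac{E_0 \, e^{j \omega t}}{\omega_0^2 - \omega^2 + j \omega \gamma / m_e} ,
\qquad \omega_0 = \sqrt{K / m_e} .
\end{align*}
```

In this form the amplitude is largest near $\omega = \omega_0$, where the real part of the denominator vanishes and only the damping term limits the response.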

However, by exploiting the limit of electrically small particles it is possible to evaluate the resonant behavior of the cube nanoparticle accurately. In order to study its electromagnetic properties, in terms of scattering and absorption cross-sections, the following assumptions are made:

- The particle is homogeneous and the surrounding material is a homogeneous, isotropic and non-absorbing medium.

- The impinging plane wave has the electric field E parallel and the propagation vector k perpendicular to the nanoparticle principal axis, as depicted in Figure 2.

Figure 2: Geometrical sketch of the gold nanocube particle. 

Under such conditions, we can relate the macroscopic 

nanoparticle properties to the polarizability of the 

nanoparticle. 

It is well known [12] that the polarizability of an arbitrarily shaped particle can be expressed as:

(3) 

where is the volume of the particle, the surrounding 

dielectric environment permittivity, the inclusion 

dielectric permittivity and is the depolarization factor. 

The nanoparticle polarizability strongly depends on the inclusion geometry, its metallic electromagnetic properties, and the permittivity of the surrounding dielectric environment. In particular, the depolarization factor of a nanoparticle plays a critical role in the resonant behaviour of the polarizability and hence in the strength of the LSPR.
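The dependence of the polarizability on the permittivities and on the depolarization factor can be made concrete with a short numerical sketch. The expressions used below are the common quasi-static dipole formulas and are assumptions here, since the exact form and normalization of (3) and (4) are not legible in this text: the polarizability is taken with dimensions of volume, the permittivities are relative, a lossy material has a positive imaginary part, and the depolarization factor is passed in as a parameter. The value 1/3 used in the example is the sphere value and not the cube value derived in the paper.

```python
import numpy as np

def polarizability(volume, eps_e, eps_i, L):
    """Quasi-static dipole polarizability (assumed normalization, units of volume)."""
    return volume * (eps_i - eps_e) / (eps_e + L * (eps_i - eps_e))

def cross_sections(volume, eps_e, eps_i, L, wavelength):
    """Absorption and scattering cross-sections of a small dipolar particle.

    Lengths must share one unit (e.g. nm), so results are in that unit squared.
    """
    k = 2.0 * np.pi * np.sqrt(eps_e) / wavelength      # wavenumber in the host medium
    alpha = polarizability(volume, eps_e, eps_i, L)
    c_abs = k * np.imag(alpha)
    c_sca = k**4 / (6.0 * np.pi) * np.abs(alpha)**2
    return c_abs, c_sca

# Illustration: particle of volume (50 nm)^3 in vacuum, sphere-like factor L = 1/3,
# with a hypothetical lossy permittivity near the dipole resonance eps_i = -2.
c_abs, c_sca = cross_sections(50.0**3, 1.0, -2.0 + 0.5j, 1.0 / 3.0, 520.0)
print(c_abs, c_sca)
```

The denominator of the polarizability becomes small when the real part of the particle permittivity approaches eps_e * (1 - 1/L), which for L = 1/3 gives the familiar value of minus two times the host permittivity. Changing L therefore moves the resonance, which is the geometric dependence the text refers to.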

Starting from [12], it is possible to develop new 

analytical closed-form formulas for the scattering and 

absorption cross-section of the aforementioned particles. 

The general corresponding expressions read, respectively: 

 

 

(4) 


where is the wavenumber, is the 

wavelength and is the refractive index of the 

surrounding dielectric environment. Im stands for 

"Imaginary part". 

By considering the electric field polarization of the 

impinging plane wave, the absorption cross-section reads 

[13]: 

 

 

 

 

where is: 

 

 

 

 

 

III. BOUNDARY ELEMENT METHOD APPROACH 

Under quasi - static approximation the electric field can 

be expressed through the scalar potential as: 

(5) 

(6) 

(7) 

For homogeneous isotropic frequency-dispersive media 

can be determined easily from the Laplace equation: 

 

(8) 

In fact, by assuming an impulsive source the solution of 

(8) is well known through the Green function 

as: 

 

 

 

 

(9) 

where and are the position vector and source vector, 

respectively. 

However, for an inhomogeneous medium such as a nanoparticle embedded in a dielectric environment (Figure 3), the solutions (9) remain valid but must be supplemented by appropriate boundary conditions.

Figure 3: Gold nanocube particle embedded in a dielectric environment.

Following [11], the scalar potential for the inhomogeneous medium can be evaluated as:

 

 

 

(10) 

by adding an artificial charge distribution at the boundary 

of discontinuity, determined from the continuity of 

the tangential electric field and normal component of the 

dielectric displacement [14]. 

The expression (10) can be converted from boundary integrals to boundary elements. Following the procedure reported in [15], the particle boundary is discretized into small surface elements, assuming that the surface charges are located at the center of each surface element.

In this way, it is possible to obtain numerically for a 

given external excitation the surface charge density 

and, consequently, the near electric field distribution and 

the far field properties in terms of absorption, scattering 

and extinction cross sections. 

IV. RESULTS FOR THE SINGLE PARTICLE 

The electromagnetic properties of the cube particle are evaluated using the quasi-static analytical model and the boundary element method (BEM) approach [15], and are compared to the results obtained with full-wave numerical simulations [10].

We have assumed that the structure is excited by an impinging plane wave as shown in Figure 2. In addition:

- for the cube particle, experimental values [16] of the complex permittivity function have been used;

- the surrounding dielectric medium is vacuum.

Far-field properties in terms of absorption and scattering cross-sections are shown in Figure 4 and Figure 5.

Figure 4: Absorption and scattering cross section spectra obtained with 

the analytical model (l=50 nm). 


Figure 5: Absorption and scattering cross section spectra obtained 

through full-wave simulations (l=50nm). 

There is good agreement between the results obtained with the analytical model (Figure 4) and the full-wave simulations (Figure 5).

The full-wave simulations are also compared with the numerical results obtained with the BEM, as shown in Figure 6.

Figure 6: Comparison between extinction spectra obtained with BEM 

and full-wave simulations (l=50nm). 

The difference between the results shown in Figure 6 could be associated with the different discretization of the edges of the particle in these two approaches.

The near electric field distribution is obtained through full-wave simulation, as depicted in Figure 7.

Figure 7: Near electric field distribution for a single nanocube particle 

(l=50nm). The incident electric field amplitude is 1 V/m. 

Figure 7 clearly shows the dipolar charge distribution, in accordance with the quasi-static approach.

V. LSPR DEVICE 

To enhance the LSPR mechanism, it is possible to exploit the inter-coupling among nanoparticles. This effect originates from the charge induction among two or more nanoparticles, which interact more strongly as they get closer to each other [17].

To use this enhancement mechanism we propose a structure consisting of a linear chain of gold nanocubes deposited on a silica substrate, excited by a plane wave as depicted in Figure 8.

Figure 8: Linear chain of gold nanocubes on silica substrate with 

a=500nm, b=100nm, l=50 nm, l/8


the full-wave simulations (d=l= 50nm). 

VII. BIOSENSING APPLICATION OF THE DEVICE 

By using very small inter-particle distance among the 

nanoparticles it is possible to obtain high scattering and 

low absorption efficiencies (Figure 12, TABLE I). These 

properties are very important for biosensing applications. 

In fact high absorption efficiency could heat the 

biological sample invalidating medical diagnosis. 


the full-wave simulations (d=l/8= 6.25nm). 

For biosensing application we suppose that the device 

(grey) is in direct contact with the biological sample 

under test (green) as depicted in Figure 13. The sensor 

behavior is related to the effective refractive index 

variation of the overall system "LSPR device - biological 

compound". 

Once the biological compound is placed on the device, 

the system "sensor-biological compound" is illuminated 

by an optical electromagnetic field (Figure 13). The 

detected signal has a new frequency position and its 

magnitude and amplitude width are both dependent on 

the different characteristics of the biological compound. 


Figure 13: The sensing system operation scheme 

The biological sample used to test this device is an in-silico replica with values of Refractive Index (RI) taken from the literature. In particular, the RI values of rat mammary adipose and tumor tissue have been considered [18]. These data (TABLE II) were acquired using an interferometric imaging system (Optical Coherence Tomography - OCT technique).

TABLE II

Tissue type    Refractive index (mean value)
Tumor          1.39
Adipose        1.467

The data show that a difference exists between the RI of adipose tissue and that of tumor tissue.

The electromagnetic sensor response is evaluated in terms 

of extinction cross-section through full-wave simulations 

[10] as depicted in Figure 14. 

Figure 14: Extinction spectra for rat mammary cancer (RI=1.39) and 

adipose tissue (RI=1.467). 

As shown in Figure 14, the resonant peak shifts from 634 nm for tumor tissue to 650 nm for regular adipose tissue. Sensitivity is evaluated as S = Δλ/Δn, expressed in nm/RIU (Refractive Index Unit). In this case the sensitivity reached 207 nm/RIU.
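The sensitivity figure can be checked directly from the quoted peak positions and the refractive indices of TABLE II. The helper name below is hypothetical; the numbers are the ones given in the text.

```python
def sensitivity(lambda_a_nm, lambda_b_nm, n_a, n_b):
    """Bulk refractive-index sensitivity S = delta(lambda) / delta(n) in nm/RIU."""
    return (lambda_b_nm - lambda_a_nm) / (n_b - n_a)

# Tumor tissue: peak at 634 nm, RI 1.39. Adipose tissue: peak at 650 nm, RI 1.467.
S = sensitivity(634.0, 650.0, 1.39, 1.467)
print(f"S = {S:.1f} nm/RIU")
```

The result is a shift of 16 nm over an index difference of 0.077, that is, close to 208 nm/RIU, in line with the value of 207 nm/RIU reported above.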

The near electric field distribution obtained for this sensing platform (Figure 15) is less concentrated than the one obtained for d=l (Figure 10).

Figure 15: Near electric field distribution for d=l/8= 6.25 nm. The incident electric field amplitude is 1 V/m.

This result is in accordance with the prevailing scattering phenomenon (TABLE I).

VIII. CONCLUSION 

In this paper a nanostructure device operating in the visible regime was proposed. The device consists of a linear chain array of gold nanocubes deposited on a silica substrate. In this way a near-field enhancement is obtained, and analytical formulas to describe this phenomenon are presented.

For the single nanoparticle, good agreement between analytical results and numerical solutions was achieved.

By exploiting the electromagnetic properties of the device, it was shown that the proposed structure could be successfully used as a biomedical sensor or as an optoelectronic device.

REFERENCES

[1] A. Moores and F. Goettmann, "The plasmon band in noble metal nanoparticles: an introduction to theory and applications," New Journal of Chemistry, vol. 30, pp. 1121-1132, 2006.

[2] E. Hutter and J.H. Fendler, "Exploitation of Localized Surface 

Plasmon Resonance," Advanced Materials, vol. 16, pp. 1685- 

1706, 2004. 

[3] L.J. Sherry, S.-H. Chang, G.C. Schatz and R.P. Van Duyne, 

"Localized Surface Plasmon Resonance Spectroscopy of Single 

Silver Nanocubes," Nano Lett., vol. 5, pp. 2034–2038, 2005. 

[4] G. Mie, "Contributions to the optics of turbid media, particularly 

of colloidal metal solutions," Ann. Phys., vol. 25, pp. 377-445, 

1908. 

[5] M. Tréguer-Delapierre, J. Majimel, S. Mornet, E. Duguet and S. 

Ravaine, "Synthesis of non-spherical gold nanoparticles," Gold 

Bulletin, vol. 41, pp. 195-207, 2008. 

[6] W. Cai, T. Gao, H. Hong and J. Sun, "Application of gold nanoparticles in cancer nanotechnology," Nanotechnology, Science and Application, vol. 1, pp. 17-32, 2008.

[7] K.R. Catchpole and A. Polman, "Plasmonic solar cells," Optics Express, vol. 16, pp. 21793-21800, 2008.

[8] D.S. Grubisha, R.J. Lipert, H.-Y. Park, J. Driskell and M.D. 

Porter, "Femtomolar Detection of Prostate-Specific Antigen: An 

Immunoassay Based on Surface - Enhanced Raman Scattering and 

Immunogold Labels," Anal. Chem., vol. 75, pp. 5936-5943, 2003. 

[9] S. Kessentini, D. Barchiesi, T. Grosges and M. Lamy de la 

Chapelle, "Selective and Collaborative Optimization Methods for 

Plasmonics: A Comparison," PIERS Online, vol. 7, pp. 291-295, 

2011. 

[10] CST Computer Simulation Technology, www.cst.com 


[11] U. Hohenester and J. Krenn, "Surface plasmon resonances of 

single and coupled metallic nanoparticles: A boundary integral 

method approach," Phys. Rev. B, vol. 72, pp.195429, 2005. 

[12] A. Sihvola, "Electromagnetic Mixing Formulas and Applications," The Institution of Engineering and Technology - London, 2008.

[13] L. La Spada, R. Iovine and L. Vegni, "Nanoparticle 

Electromagnetic Properties for Sensing Applications," Advances 

in Nanoparticles, vol. 1, pp. 9-14, 2012. 

[14] F.J. Garcìa de Abajo, "Retarded field calculation of electron 

energy loss in inhomogeneous dielectrics," Physical Review B, 

vol. 65, pp. 115418.1-115418.17, 2002. 

[15] U. Hohenester and A. Trugler, "MNPBEM- A Matlab toolbox for 

the simulation of plasmonic nanoparticles," Computer Physics 

Communications, vol. 183, pp. 370-381, 2012. 

[16] P.B. Johnson and R.W. Christy, “Optical Constants of the Noble 

Metals,” Phys. Rev. B, vol. 6, pp.4370-4379, 1972. 

[17] T. Chung, S.-Y. Lee, E.Y. Song, H. Chun and B. Lee, "Plasmonic 

Nanostructures for Nano-Scale Bio - Sensing," Sensors, vol. 11, 

pp. 10907-10929, 2011. 

[18] A.M. Zisk, E.J. Chaney and S.A. Boppart, "Refractive index of carcinogen-induced rat mammary tumours," Phys. Med. Biol., vol. 51, pp. 2165-2177, 2006.


Validation of measurements with conjugate heat 

transfer models 

M. Schrittwieser 1, 2 , O. Bíró 1, 2 , E. Farnleitner 3 , and G. Kastner 3 

1 Institute for Fundamentals and Theory in Electrical Engineering, Inffeldgasse 18, A-8010 Graz, Austria 

2 Christian Doppler Laboratory for Multiphysical Simulation, Analysis and Design of Electrical Machines, 

Innfeldgasse 18 A-8010 Graz, Austria 

3 Andritz Hydro GmbH, Dr. Karl- Widdmann- Strasse 5, A-8160 Weiz, Austria 

E-mail: schrittwieser@tugraz.at 

Abstract— The paper presents a comparison of thermal measurements on three stator duct models of an electrical machine. 

These models differ from each other by the slot section components. The measurements show the advantages and 

disadvantages of different variations. In order to study the measurement results in detail, a comparison with Computational 

Fluid Dynamics (CFD) was conducted, where it was useful to apply the Conjugate Heat Transfer (CHT) method, because it 

takes the convection and conduction into account. Therefore the conditions for the numerical heat transfer model can be 

determined more realistically, especially for the temperature rise in the solid domains caused by losses. 

Index Terms— Fluid Flow, Measurement, Stators, Thermal Analysis 


I. INTRODUCTION

Hydro generators in hydropower plants produce electric power
in the range above 10 MVA. The copper, hysteresis, eddy current
and mechanical losses arising during operation lead to a
temperature rise in the electrical machine. This heat has to be
discharged to preserve the operating characteristics, which is
the purpose of the cooling scheme. Thermal and air flow networks
are mostly used to design the cooling of a generator. Their
parameters have to be established theoretically, by measurements
or by CFD. The temperature rise is handled by solving the energy
equation with the focus on heat convection and heat conduction.
The convective heat transfer coefficient (HTC) is one of the most
important parameters of these networks and must be known
accurately. Examples of such networks are presented in [1] and [2].

In recent years several investigations have been 

carried out on the topic of heat transfer, especially for low 

power electrical machines. Two different methods have 

emerged to get information about the HTC. One uses 

thermal resistances, defined with the aid of temperatures 

gained by measurements [3], [4] and [5]. The other 

employs CFD calculations combined with measurements 

[6], [7] and [8]. 

The convective HTC has been calculated for large 

numerical models with CFD at different parts in [9] and 

[10] where the numerical effort is very high due to the 

large number of nodes in the model. A special set-up of
boundary conditions has been tried in order to reduce the section
to be analyzed while keeping the results comparable, with special
attention paid to the rotor-stator interaction. Only the fluid
material properties are significant in these CFD simulations, and
the wall temperatures have been defined as boundary conditions
from measurements. Refining the mesh near the wall is very
important for calculating the heat transfer accurately. An
indicator of the mesh density is the dimensionless wall distance
y+, which should be about y+ ≤ 1 [11]. The primary reason for this
is that the HTC is a function of the dimensionless wall
distance [12].
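The wall-distance criterion can be checked with a few lines of Python. The sketch below uses the standard definition y+ = y·u_τ/ν with the friction velocity u_τ = sqrt(τ_w/ρ). The function name, the first-cell height and the wall shear stress are hypothetical illustration values and are not taken from the measurements or the simulations.

```python
import math


def y_plus(first_cell_height, wall_shear_stress, density, dynamic_viscosity):
    """Dimensionless wall distance y+ = y * u_tau / nu (standard definition)."""
    u_tau = math.sqrt(wall_shear_stress / density)  # friction velocity in m/s
    nu = dynamic_viscosity / density                # kinematic viscosity in m^2/s
    return first_cell_height * u_tau / nu


# Hypothetical example: air, first cell height 0.01 mm, wall shear stress 0.5 Pa
value = y_plus(1.0e-5, 0.5, 1.13, 1.831e-5)
print(f"y+ = {value:.2f}, criterion y+ <= 1 met: {value <= 1.0}")
```

Since y+ scales linearly with the first-cell height, halving the near-wall cell size halves y+ for the same flow condition.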

The heat transfer caused by conduction has been 

considered in several papers by the finite element method 

(FEM) [13], [14] and [15]. The advantage of CFD over 

FEM is the consideration of the actual wall heat transfer 

coefficient. The disadvantage of the CFD is that the 

losses cannot be considered, while FEM is capable of 

this. Therefore the copper and iron losses have to be 

implemented differently in CFD e.g. using the conjugate 

heat transfer (CHT) method. The sources can be defined 

in the solid domains and the material properties play an 

important role for the CHT solution. 

This paper presents a mutual validation of calorimetric 

measurements and a numerical calculation. The CHT 

method (fluid and solid heat transfer) is applied to a stator 

duct model. The losses have been defined as sources in 

the solid domains. The main objective is to evaluate the 

slot geometries with different winding assemblies. All 

three models have been measured at 5 flow rate points to 

pinpoint their thermal characteristics. 

II. MEASUREMENT 

A simplified model of a stator section has been under 

experimental investigation at ANDRITZ Hydro. The 

main objective of the measurements has been to find and 

compare the thermal characteristics of different winding 

assemblies. 

Air has been used as cooling fluid for the experimental 

set-up. 

A. Investigated model 

The laboratory model and the cooling scheme are 

shown in detail in Fig. 1. The cooling fluid streams from 

the Inlet through the measuring nozzle (a) to the 

temperature probe (b) and from there through rectangular 

channels of wood (c) into the stator duct model (d). After 

heat exchange the warm air streams through wood ducts 

to an outlet channel, which contains resistance thermo 

elements (f). The outer surface of the model has been

insulated (e) for reduction of secondary heat flux. 

Fig. 1: Calorimetric measurement and experimental set-up of the stator 

laboratory model; (a) measuring nozzle, (b) Pt-100 temperature probe, 

(c) wood channels; (d) stator duct model; (e) insulation of the model and 

(f) resistance thermo elements 

B. Measuring physical parameters 

The measuring nozzle defines the volume flow rate Vin 

immediately in front of the model inlet. 

The fluid temperature T and the density ρ have been
measured at the inlet and outlet of the stator duct model.
These calorimetric measurement data allow the heat flux to be
calculated once steady state has been reached. The energy
exchange occurs in the stator duct model. Therefore, it is
also important to determine the solid temperature and the fluid
temperature in the ducts. Fig. 2 shows the positions of the
temperature probes in the iron domain.

Fig. 2: Position of measurement probes in the iron; (a) 1 st stator core, (b) 

2 nd stator core and (c) heating rod 

The stator model consists of a section including 5 slots 

in circumferential direction and 3 ventilation ducts with 

distance bars between the laminated iron sheets in axial 

direction. The temperature has been measured in two
stator cores. For this purpose, fifteen Pt-100 resistance
thermometers with a probe length of 20 mm have been
positioned in each stator core.

The heat sources have been simulated with heating rods
positioned in a hole in the middle of the winding bars, which
are made of solid copper, see Fig. 3. Each rod has a length of
100 mm, a diameter of 6 mm and a constant heat output. The
upper and lower bars have been heated until steady state was
reached. The heat output has been kept constant during the
whole experiment.

Fig. 3: Position of temperature measurement devices for the cooling 

fluid; solid temperature positions for measuring (a) copper temperature 

and (b) spacer temperature; position of (c) the heating rod 

The temperatures in each copper bar have been measured
with two NiCrNi thermocouples 60 mm in length. The
temperature in the spacer has been measured by a Pt-100.

C. Results of measurements 

Table I shows the measurement data obtained for the 5 

different operating points for each model under 

investigation. The temperature differences have been 

normalized by the fluid inlet temperature. 

TABLE I

CALORIMETRIC MEASUREMENT RESULTS

Model   Vin in m³/s   ρ in kg/m³   (Tout − Tin)/Tin   (Tcopper − Tin)/Tin   (Tiron − Tin)/Tin
A       0.080         1.130        0.12               1.39                  0.27
A       0.060         1.129        0.16               1.60                  0.36
A       0.040         1.138        0.26               1.99                  0.55
A       0.025         1.134        0.41               2.35                  0.81
A       0.015         1.133        0.68               2.96                  1.30
B       0.079         1.155        0.15               1.80                  0.48
B       0.061         1.154        0.21               2.04                  0.60
B       0.041         1.152        0.31               2.40                  0.83
B       0.025         1.154        0.51               2.94                  1.22
B       0.015         1.152        0.81               3.55                  1.80
C       0.078         1.155        0.15               1.80                  0.51
C       0.060         1.149        0.20               2.03                  0.64
C       0.041         1.147        0.31               2.34                  0.85
C       0.025         1.149        0.51               2.84                  1.23
C       0.015         1.151        0.81               3.39                  1.77
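The calorimetric evaluation described in Section II.B can be sketched as follows, assuming the usual steady-state balance Q = ρ·V·cp·(Tout − Tin). The inlet temperature is not stated in the paper, so `T_IN_ASSUMED` is a purely hypothetical value used to de-normalize the first row of model A; the constant cp is the air value from Table II.

```python
def calorimetric_heat_flow(volume_flow, density, cp, delta_t):
    """Heat flow in W carried away by the air: Q = rho * V * cp * (T_out - T_in)."""
    mass_flow = density * volume_flow  # kg/s
    return mass_flow * cp * delta_t


CP_AIR = 1004.4       # J/(kg K), constant value from Table II
T_IN_ASSUMED = 25.0   # hypothetical inlet temperature, NOT given in the paper

# First row of model A in Table I: V = 0.080 m^3/s, rho = 1.130 kg/m^3
delta_t = 0.12 * T_IN_ASSUMED  # de-normalized temperature rise (assumption)
q = calorimetric_heat_flow(0.080, 1.130, CP_AIR, delta_t)
print(f"Heat flow removed by the air: {q:.1f} W")
```

The absolute number depends entirely on the assumed inlet temperature; only the structure of the balance is meant to be illustrative.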

III. MODEL GEOMETRIES 

The measurement set-up has been implemented in 

ANSYS CFX [11]. 

The whole numerical model is shown in Fig. 4. The 

cooling scheme is the same as during the measurements 

i.e. the wood channels have also been modeled. Adiabatic 

walls have been defined at the top and the bottom of the

numerical model in z-direction. 

In addition to this simulation, a periodic boundary
condition has been defined at the surfaces normal to the
x-direction. The goal of this is to reduce the section to be
analyzed (fewer nodes together with smaller elements).

Fig. 4: CHT model 

Fig. 5 visualizes the numerical stator model in detail.
For the calculation, one slot section has been investigated
due to the inlet condition, which is the same for each slot
section as in the measurement. The winding assemblies
are nearly the same, i.e. copper bars (d) and (e) with
insulation (f), spacer (g) between the winding bars and,
for positioning in radial direction, the slot wedge (h). The
iron (b) and (c) is located at the top and bottom of the
fluid (a).

The difference in the models is the contact between
insulation and the iron.

Fig. 5: Numerical stator model consists of the (a) fluid in the stator duct, 

(b) iron teeth, (c) iron yoke, (d) top copper bar, (e) bottom copper bar, 

(f) insulation, (g) spacer between bars and (h) slot wedge

The following models differ from each other in the type
of the winding assembly. There are different options for
mounting the winding, which will be explained in detail
for each model.

A. Model A 

This is a model with an air gap (white) between
insulation and iron teeth, see Fig. 6. This air gap has a
constant length. The cooling fluid can stream in axial
direction from one duct to another due to the air gap.

Fig. 6: Model A with air gap (white) 


B. Model B 

A ripple spring (white dashed) is positioned on one
side between the iron teeth and the insulation instead of
the air gap, as shown in Fig. 7. This ripple spring has
a corrugation in diagonal direction. In the model, this
corrugation has been smoothed along the surface and
represented by a thermal resistance at the contact
interface. This causes an asymmetric energy transport, and
the fluxes are higher at the side without a ripple spring.

Fig. 7: Model B with ripple spring (white dashed)

C. Model C 

This model is similar to model A, with the difference
that epoxy resin (white dotted) is present. This is shown
in Fig. 8. In this case the air can stream in axial direction
through the air gap (white), like in model A.

Fig. 8: Model C with epoxy resin (white dotted) and air gap (white)

IV. NUMERICAL METHOD

The material properties have a significant influence on
the numerical solution. In CFD, the fluid properties play
the most important role and the solid domains are not
taken into account. Only the convection has an influence
in such calculations and the conduction is not considered.

The conjugate heat transfer method differs from the
conventional CFD simulation in the consideration of the
heat conduction in the energy equation. Therefore, the
thermal conductivity has to be known and defined for
each medium in the CFD code [16].

A. Turbulence Model 

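Computational Fluid Dynamics uses the Finite Volume
Method for solving the transport equations:

\frac{\partial \rho}{\partial t} + \frac{\partial (\rho u_j)}{\partial x_j} = 0    (1)

\frac{\partial (\rho u_i)}{\partial t} + \frac{\partial (\rho u_i u_j)}{\partial x_j} = -\frac{\partial p}{\partial x_i} + \frac{\partial \tau_{ij}}{\partial x_j} + \rho f_i    (2)

\frac{\partial (\rho e_t)}{\partial t} + \frac{\partial (\rho u_j e_t)}{\partial x_j} = -\frac{\partial (u_j p)}{\partial x_j} + \frac{\partial (u_i \tau_{ij})}{\partial x_j} + \rho u_i f_i + \frac{\partial q_j}{\partial x_j} + Q    (3)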

These equations can be solved directly for laminar flows. If the
velocity and all other parameters vary in a random and
chaotic way, the regime is called turbulent [17]. For most
problems, it is unnecessary to resolve the detailed
turbulent fluctuations and it is sufficient to calculate the
time-averaged properties of the flow. Therefore, the
Reynolds Averaged Navier Stokes (RANS) equations
have been used. The Reynolds stress tensor is an additional
unknown, and further equations must be defined to close
the system [18].
In the present case the Shear Stress Transport (SST)
turbulence model [19] has been used. It combines the
advantages of the k-ε turbulence model in the free stream
with those of the k-ω turbulence model near the
wall [20], [21].

B. Model Configuration and Boundary Conditions 

The mass flow rate and the inlet temperature have been 

defined as measured, see Table I. The pressure at the 

outlet has been defined as ambient pressure. 

The heat output from the heating element has been 

defined on a length of 100 mm in the middle of the 

copper bars with a constant value gained from the 

measurement. 

C. Fluid Properties 

The specific heat capacity cp, the dynamic viscosity μ
and the thermal conductivity λ have been defined as the
constant values given in Table II.

TABLE II

AIR IDEAL GAS MATERIAL PARAMETERS

cp in J/kgK    μ in Pa s      λ in W/mK
1004.4         1.831·10^-5    2.61·10^-5

For an ideal gas, the density is calculated with the ideal
gas equation [16]:

\rho = \frac{n \, p_{abs}}{R_0 \, T}    (4)

dh = c_p \, dT    (5)

Here, n is the molecular weight, p_abs is the absolute
pressure, T is the temperature, R_0 is the universal gas
constant and ρ is the density.

These material properties from Table II have been
adapted to implement the temperature dependence of the
streaming fluid. In this case the specific heat capacity cp
is expressed by the zero pressure polynomial [11]

\frac{c_p}{R_S} = a_1 + a_2 T + a_3 T^2 + a_4 T^3 + a_5 T^4    (6)

with the temperature T in Kelvin and the gas constant for
air R_S = 287.058 J/kgK and the following coefficients:

a_1 = 3.57, a_2 = -4.3·10^-4 K^-1,
a_3 = -4.2·10^-8 K^-2, a_4 = 3.1·10^-9 K^-3,
a_5 = -2.4·10^-12 K^-4.
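Equations (4) and (6) can be evaluated directly. The following minimal sketch uses the coefficients and the gas constant quoted in the text; the molar mass of air (0.02896 kg/mol) and the function names are assumptions made for this illustration only.

```python
R_UNIVERSAL = 8.314462618  # universal gas constant R0 in J/(mol K)
R_S_AIR = 287.058          # specific gas constant of air in J/(kg K), from the text
# Coefficients a1..a5 of the zero pressure polynomial (6), from the text
A = (3.57, -4.3e-4, -4.2e-8, 3.1e-9, -2.4e-12)


def density_ideal_gas(p_abs, temperature, molar_mass=0.02896):
    """Equation (4). molar_mass in kg/mol is an assumed value for air."""
    return molar_mass * p_abs / (R_UNIVERSAL * temperature)


def cp_air(temperature):
    """Equation (6): cp = R_S * (a1 + a2*T + a3*T^2 + a4*T^3 + a5*T^4), T in K."""
    polynomial = sum(a * temperature ** k for k, a in enumerate(A))
    return R_S_AIR * polynomial


print(f"rho(300 K, 101325 Pa) = {density_ideal_gas(101325.0, 300.0):.4f} kg/m^3")
print(f"cp(300 K) = {cp_air(300.0):.1f} J/(kg K)")
```

Near room temperature the polynomial stays close to the constant cp of Table II, which is a quick plausibility check for the coefficients.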

The values for the viscosity are approximated by
Sutherland's formula

\frac{\mu}{\mu_0} = \frac{T_0 + S_\mu}{T + S_\mu} \left( \frac{T}{T_0} \right)^{n_\mu} ,    (7)

and similarly for the conductivity

\frac{\lambda}{\lambda_0} = \frac{T_0 + S_\lambda}{T + S_\lambda} \left( \frac{T}{T_0} \right)^{n_\lambda} .    (8)

In these formulas, S_μ and S_λ stand for the Sutherland
constants and n_μ and n_λ for the appropriate exponents. For
the reference viscosity μ_0 and the reference conductivity λ_0,
the following values have been chosen from a material
property table [22] at the reference temperature T_0 = 325 K,
which is close to the mean operating temperature of the
cooling fluid:

S_μ = 77.80 K, μ_0 = 1.97·10^-5 Pa s, n_μ = 1.57,
S_λ = 60.71 K, λ_0 = 2.82·10^-3 W/mK, n_λ = 1.66.

The material properties are accurately approximated in 

the temperature range from about 260 K to 670 K with 

this approach. It is not recommended to use the same 

parameters outside this range of temperature [22]. 
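A minimal sketch of the Sutherland approximation for the viscosity, using the constants quoted above, is given below. The function names and the explicit range check are choices made for this illustration; the 260 K to 670 K limits are those stated in the text.

```python
T_REF = 325.0  # reference temperature T0 in K, from the text


def sutherland(temperature, ref_value, s_const, exponent, ref_temperature=T_REF):
    """Generic Sutherland law: ref * (T0 + S)/(T + S) * (T/T0)**n."""
    ratio = (ref_temperature + s_const) / (temperature + s_const)
    return ref_value * ratio * (temperature / ref_temperature) ** exponent


def viscosity_air(temperature):
    """Dynamic viscosity in Pa s with the constants quoted for equation (7)."""
    if not 260.0 <= temperature <= 670.0:
        # The text advises against using these parameters outside this range.
        raise ValueError("temperature outside the fitted range 260 K to 670 K")
    return sutherland(temperature, 1.97e-5, 77.80, 1.57)


for t in (300.0, 325.0, 400.0):
    print(f"mu({t:.0f} K) = {viscosity_air(t):.3e} Pa s")
```

At the reference temperature the formula returns the reference value exactly, which makes a convenient sanity check when the constants are entered into a solver.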

D. Solid Properties 

The CHT method solves the following transport
equation in the solid domains:

\frac{\partial (\rho h)}{\partial t} + \frac{\partial (\rho u_{s,j} h)}{\partial x_j} = \frac{\partial}{\partial x_j} \left( \lambda \frac{\partial T}{\partial x_j} \right) + S_E    (9)

The important parameter in this equation is the thermal
conductivity λ, which has a great influence on the results
of the heat conduction and has to be known exactly.
The conductivity has been defined as isotropic for the
copper, insulation, spacer, slot wedge, ripple spring and
epoxy resin and as anisotropic for the iron.
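To illustrate how the conductivity and the source term interact in the solid, the sketch below solves a strongly simplified special case of the solid energy equation: steady state, no solid motion, one dimension and constant conductivity, i.e. λ·d²T/dx² + S_E = 0 with fixed wall temperatures. It is not the CFX discretization; the slab dimensions, conductivity and source strength are hypothetical.

```python
def solve_slab(n, length, lam, source, t_left, t_right):
    """Steady 1D conduction with uniform source, central differences.

    Solves -T[i-1] + 2*T[i] - T[i+1] = source*dx^2/lam on n interior nodes
    with the Thomas algorithm and returns the interior temperatures.
    """
    dx = length / (n + 1)
    rhs = [source * dx * dx / lam] * n
    rhs[0] += t_left     # Dirichlet value on the left wall
    rhs[-1] += t_right   # Dirichlet value on the right wall
    c = [0.0] * n        # modified upper diagonal
    d = [0.0] * n        # modified right-hand side
    c[0] = -1.0 / 2.0
    d[0] = rhs[0] / 2.0
    for i in range(1, n):
        denom = 2.0 + c[i - 1]
        c[i] = -1.0 / denom
        d[i] = (rhs[i] + d[i - 1]) / denom
    t = [0.0] * n
    t[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        t[i] = d[i] - c[i] * t[i + 1]
    return t


# Hypothetical slab: 0.1 m thick, lambda = 50 W/(m K), source 1e5 W/m^3
temps = solve_slab(9, 0.1, 50.0, 1.0e5, 0.0, 0.0)
print(f"Temperature rise at mid-plane: {temps[4]:.3f} K")
```

For this case the analytic solution is T(x) = S_E·x·(L − x)/(2λ), so the peak temperature rise scales inversely with λ. This is why the conductivities of the solid parts have to be known accurately.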

V. COMPARING NUMERICAL RESULTS WITH 

MEASUREMENTS 

The following figures show a comparison of 

measurement data (dashed line) and CHT solution data 

(solid line) for the three different parameters. The 

temperature values have been normalized in the following 

figures. 

A. Copper temperature 

The temperature difference in Fig. 9 is calculated with 

an average value of the copper temperature in the top and 

bottom bar and the fluid temperature at the inlet. The 

deviation is due to the copper temperature alone, because
the inlet temperature for the CFD calculation has been
taken from the measurements and is therefore exactly the
same as in the measurement.

An average deviation of 1.98 % has been calculated
for model A, 0.95 % for model B and 1.06 % for model
C. The diagram shows that, in the measurements, the
differences between the models become smaller with a
higher flow rate, contrary to the calculation. At the
highest flow rate, the deviation is 3.54 % for model A,
1.34 % for model B and 1.57 % for model C.

[Plot: normalized temperature difference versus volume flow rate in m³/s; curves: Measurement A, B, C and CHT A, B, C]

Fig. 9: Normalized temperature difference between copper temperature 

and fluid inlet temperature for the three stator duct models 

The temperature distribution is plotted in Fig. 10
for the stator duct models A, B and C. The section plane is
located at the middle of the first stator core (see Fig. 2),
and the temperature is shown for all solid domains.

Fig. 10: Temperature distribution through the middle of stator core 1 

with all parts; (a) model A, (b) model B and (c) model C 

The highest copper temperatures are found in model A
(a). The asymmetric temperature distribution in model B (b)
is also recognizable in the iron; the temperature in the iron
is higher on the side opposite the ripple spring (bottom
side) because of the higher heat flux. The epoxy resin in
model C (c) leads to a lower temperature in the
insulation than along the air gap (see the detail in Fig. 10
c). This will have a positive effect on the aging of the
insulation.

B. Iron temperature 

The iron temperature has been calculated as an average 

value of all measuring points (Fig. 2). The difference to 

the fluid inlet temperature has been plotted as before. The 

average deviation is 13.20 % for model A, 6.04 % for 

model B and 3.52 % for model C. It is worth noting that 

model A has the highest deviation for the iron 

temperature, see Fig. 11. The deviation for each winding 

assembly decreases with a higher volume flow rate. 


[Plot: temperature difference versus volume flow rate in m³/s]

Fig. 11: Temperature difference between iron temperature and fluid
inlet temperature for the three stator duct models


C. Fluid temperatures 

The fluid temperature rise is shown in Fig. 12. 

[Plot: temperature difference versus volume flow rate in m³/s]

Fig. 12: Difference between fluid inlet and outlet temperature

The temperature difference has been calculated from the
inlet and outlet temperatures of the air. The difference
decreases with increasing volume flow rate. The average
deviation between measurement and calculation is 0.97 %
for model A, 1.09 % for model B and 0.99 % for model C.
The highest deviation occurs at the lowest flow rate and is
about 2.84 % for model A, 3.47 % for model B and 3.10 %
for model C.


VI. CONCLUSION

The paper has described the conjugate heat transfer
method for a stator model example. The advantage of
using CHT is that the heat transfer coefficient is
inherently part of the solution and need not be defined as
constant at the surfaces.

The numerical solution shows good agreement with the
measurements for each stator duct model. Averaged over
the three models, the deviation of the temperature
difference between copper and fluid inlet is less than
about 1.4 %, and that of the temperature difference
between iron and fluid inlet is under 8 %. The heating of
the air has been calculated with an average deviation of
less than 1.5 %; the deviation reaches its maximum of
3.1 % at the lowest operating point and its minimum of
about 0.2 % at the highest flow rate. This can be explained
by the fact that steady state is reached faster for lower
flow rates than for higher ones. Based on these results,
the conclusion can be drawn that the best agreement is
obtained for model C and the worst for model A. These
investigations provide a determination of proper model
conditions in the slot region, which can be used for
further CHT research.
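The summary figures can be cross-checked against the per-model deviations reported in Section V. The short sketch below simply averages those percentages over the three models. Reading the summary values as averages over the models is an interpretation made for this illustration, not a statement from the text.

```python
# Average deviations in percent per model, as reported in Section V
deviations = {
    "copper": {"A": 1.98, "B": 0.95, "C": 1.06},
    "iron": {"A": 13.20, "B": 6.04, "C": 3.52},
    "fluid": {"A": 0.97, "B": 1.09, "C": 0.99},
}


def mean_deviation(per_model):
    """Arithmetic mean of the per-model deviations in percent."""
    return sum(per_model.values()) / len(per_model)


for quantity, per_model in deviations.items():
    print(f"{quantity}: mean deviation {mean_deviation(per_model):.2f} %")
```

With these inputs the means stay below the 1.4 %, 8 % and 1.5 % bounds quoted for copper, iron and fluid, respectively.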

VII. ACKNOWLEDGMENT 

This work has been supported by the Christian Doppler 

Laboratory for Multiphysical Simulation, Analysis and 

Design of Electrical Machines (MuSicEl) and ANDRITZ 

Hydro GmbH. 

REFERENCES 

[1] E. Farnleitner and G. Kastner, “Contemporary methods of 

ventilation design for pumped storage generators,“ e&I, vol. 127, 

no. 1-2, pp. 24-29, 2010, DOI: 10.1007/s00502-010-0711-8. 

[2] G. Traxler-Samek, R. Zickermann and A. Schwery, “Cooling 

airflow, losses, and temperatures in large air-cooled synchronous 

machines,“ IEEE Transactions on Industrial Electronics, vol. 57, 

no. 1, pp. 172-180, Jan. 2010. 

[3] C. Kral, T. G. Habetler, R. G. Harley, F. Pirker, G. Pasoli, H. 

Oberguggenberger and C. J. M. Fenz, “Rotor temperature 

estimation of squirrel-cage induction motors by means of a 

combined scheme of parameter estimation and a thermal 

equivalent model,“ IEEE Transactions on Industry Applications, 

vol. 40, no. 4, July-Aug. 2004. 

[4] D. Staton, A. Boglietti, and A. Cavagnino, “Solving the more
difficult aspects of electric motor thermal analysis in small and
medium size industrial induction motors,” IEEE Transactions on
Energy Conversion, vol. 20, no. 3, Sept. 2005, DOI:
10.1109/TEC.2005.847979.

[5] A. Boglietti and A. Cavagnino, “Analysis of the endwinding 

cooling effects in TEFC induction motors,” IEEE Transactions on 

Industry Applications, vol. 43, no. 5, pp. 1214-1222, 2007, DOI: 

10.1109/TIA.2007.904399. 

[6] B.D.J. Maynes, R.J. Kee, C.E. Tindall and R.G. Kenny, 

“Simulation of airflow and heat transfer in small alternators using 


CFD,” IEE Proceedings- Electric Power Applications, vol. 150, 

no. 2, pp. 146-152, 2003, DOI: 10.1049/ip-epa:20020754. 

[7] C. Kral, A. Haumer, M. Haigis, H. Lang and H. Kapeller, 

“Comparison of a CFD analysis and a thermal equivalent circuit 

model of a TEFC induction machine with measurements,“ IEEE 

Transactions on Energy Conversion, vol. 24, no. 4, pp. 809-818, 

Dec. 2009, DOI: 10.1109/TEC.2009.2025428. 

[8] M. Hettegger, B. Streibl, O. Biro and H. Neudorfer, 

“Measurements and simulations of the convective heat transfer 

coefficients on the end windings of an electrical machine,“ IEEE 

Transactions on Industrial Electronics, vol. 59, no. 5, pp. 2299- 

2308, May 2012, DOI: 10.1109/TIE.2011.2161656. 

[9] M. Schrittwieser, A. Marn, E. Farnleitner and G. Kastner, 

“Numerical analysis of heat transfer and flow of stator duct 

models,” XX th International Conference on Electrical Machines, 

Sept. 2012. 

[10] S. Klomberg, E. Farnleitner, G. Kastner and O. Bíró, “Heat
transfer analysis on end windings of a hydro generator using a
stator-slot-sector model,” 15 th IGTE Symposium, Sept. 2012.

[11] ANSYS Inc., “ANSYS CFX- Solver Modeling Guide”, Release 

13.0, ANSYS Inc. 

[12] W. Vieser, T. Esch and F. Menter, “Heat transfer predictions 

using advanced two equation turbulence models,“ CFX Technical 

Memorandum, 2002. 

[13] C.C. Hwang, S. Wu and Y. Jiang, “Novel approach to the solution 

of temperature distribution in the stator of an induction motor,” 

IEEE Transactions on Energy Conversion, vol. 15, no. 4, pp. 401- 

406, Dec. 2000, DOI: 10.1109/60.900500. 

[14] S. Mezani, N. Takorabet and B. Laporte, “A combined 

electromagnetic and thermal analysis of induction motors,” IEEE 

Transactions on Magnetics, vol. 41, no. 5, pp. 1572-1575, May 

2005, DOI: 10.1109/TMAG.2005.845044. 

[15] L. Weili, C. Guan and P. Zheng, “Calculation of a Complex 3-D 

Model of a turbogenerator with end region regarding electrical 

losses, cooling, and heating,” IEEE Transactions on Energy 

Conversion, vol. 26, no. 4, pp. 1073-1080, Dec. 2011, DOI: 

10.1109/TEC.2011.2161610. 

[16] ANSYS Inc., “ANSYS CFX- Solver Theory Guide,” Release 

13.0, ANSYS Inc. 

[17] F. Kreith and M. S. Bohn, Principles of Heat Transfer, 6 th ed. 

Southbank, Australia: Thomson Learn., 2001. 

[18] H. K. Versteeg and W. Malalasekera, “An Introduction to 

Computational Fluid Dynamics – The finite volume method”, 

[19] F. R. Menter, “Two- equation eddy- viscosity turbulence models 

for Engineering Applications,” AIAA Journal, vol. 32, pp. 1598- 

1605, Aug. 1994. 

[20] D. C. Wilcox, Turbulence Modeling for CFD: Solutions Manual. 

2 nd Edition, La Canada, CA: DCW Industries Inc., 1994. 

[21] B. Launder and D. Spalding, Mathematical models of turbulence. 

London, U.K.: Academic Press, 1972. 

[22] VDI Heat Atlas, 10 th ed. Berlin, Germany: Springer- Verlag, 2006.


Computing the shielding effectiveness of waveguides using FE-mesh 

truncation by surface operator implementation 

C. Tuerk∗ , W. Renhart † , and C. Magele † 

∗Armament and Defence Technology Agency, Ministry of Defence and Sports, Rossauer Laende 1, A-1090 Vienna 

† Institute for Fundamentals and Theory in Electrical Engineering, Inffeldgasse 18, A-8010 Graz 

E-mail: christian.tuerk@bmlvs.gv.at 

Abstract—A plane wave incident perpendicular to one open end of a conductive tube, as part of a honeycomb-structure, is 

attenuated on its way through it. In order to calculate its total attenuation for various frequencies the FE-method will be 

used. This requires a reflectionless truncation of the FE-mesh for which a Surface Operator Boundary Condition (SOBC) 

will be employed. In order to show the accuracy and applicability of the FEM with SOBC, the results will be compared 

to entirely analytical solutions as well as to easy-to-use engineering formulae. 

Index Terms—Finite Element Method (FEM), Shielding, Surface Operator Boundary Condition (SOBC), Waveguide 


I. INTRODUCTION

Previous works, e.g. [1], have shown the implementation
of a surface operator boundary condition derived from
an analytical model into the FE-mesh. Honeycombs can
be considered waveguides-beyond-cutoff (WBC) and are
therefore employed as vents for large shielded enclosures,
like shielded rooms, because they maintain a certain degree
of attenuation of a plane wave incident on them.

The resulting attenuation imposed by a single conductive 

tube will be calculated under different ratios of 

length-to-diameter of the tube and at selected frequencies. 

Existing literature like [2] provides engineering rules for
designing waveguides-beyond-cutoff (WBC) as a shielding
component, whereas other sources like [3] present analytical
details on the physics of the transmission of electromagnetic
power in waveguides of various cross-sections. The
results, obtained through numerical computation in the
frequency range of 3 GHz to 18 GHz for a practical
design of a real-life waveguide, are then compared to
both approaches and subsequently discussed.

II. MODELLING 

A. Surface Operator Boundary Conditions (SOBC) 

Fig. 1 shows the setup used for the computation of the 

plane wave field strength incident on the tube and exiting 

it. The right-hand-side boundary is modelled by means 

of SOBCs (Γtr) matching the impedance of free space 

between the tube and the termination of the problem area. 

A plane wave travelling along the x-axis will experience 

a certain degree of attenuation by the waveguide as long 

as the waveguide-beyond-cutoff (WBC) condition is met. 

A fraction of the initial power of the wave penetrates the 

waveguide and is terminated reflectionless at Γtr. 

Based on results obtained through [5] and [4] the implementation 

of this surface operator boundary condition 

for the truncation surface Γtr of the FE-mesh can be 

directly derived from the Maxwell equations 

∇× E = −jωμ H, ∇× H = jωɛ E. (1) 

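Fig. 1. Modelling the Waveguide

After splitting the field vectors E and H as well as the
∇-operator into their normal and orthogonal tangential
components as in equation 2,

\mathbf{E} = \mathbf{E}_t + \mathbf{n} E_n , \quad \mathbf{H} = \mathbf{H}_t + \mathbf{n} H_n , \quad \nabla = \nabla_t + \mathbf{n} \frac{\partial}{\partial n}    (2)

the Maxwell equations can be reformulated as follows:

\mathbf{n} H_n = -\frac{1}{j\omega\mu} \nabla_t \times \mathbf{E}_t    (3)

\mathbf{n} E_n = \frac{1}{j\omega\varepsilon} \nabla_t \times \mathbf{H}_t .    (4)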

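With these relations the normal field components
E_n and H_n can be eliminated in equation 1.
A couple of mathematical operations finally yield

\frac{\partial (\mathbf{n} \times \mathbf{E}_t)}{\partial n} = -j\omega\mu \mathbf{H}_t - \frac{1}{j\omega\varepsilon} \nabla_t \times (\nabla_t \times \mathbf{H}_t)    (5)

\frac{\partial (\mathbf{n} \times \mathbf{H}_t)}{\partial n} = j\omega\varepsilon \mathbf{E}_t + \frac{1}{j\omega\mu} \nabla_t \times (\nabla_t \times \mathbf{E}_t) .    (6)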

These equations are generally valid and therefore also hold
on the truncation surface (see Fig. 1). On Γtr the situation
is as shown in Fig. 2 in a local coordinate system.
The propagation of the wave can be represented by the
wave vector k. Since the angle of incidence on Γtr is
known, it can be decomposed into its normal
and tangential components as given in the following set
of equations:

\mathbf{k} = \mathbf{k}_t + \boldsymbol{\beta} , \quad \beta = \pm\sqrt{k^2 - k_t^2} , \quad k = \omega\sqrt{\mu\varepsilon} .    (7)

Fig. 2. Wave at any point on Γtr 

The surface normal n is represented by the local coordinate
ζ. In order to get rid of the ∂/∂n term on the left-hand
side of equation 5 and equation 6, an integration along ζ
over the half-space must be performed. Assuming a lossy
medium, the field components must decay to zero at infinity,
which allows for

\int_{\zeta=0}^{\infty} \mathbf{H}_{t0} \, e^{-j\beta\zeta} \, d\zeta = \frac{1}{j\beta} \mathbf{H}_{t0} , \quad \int_{\zeta=0}^{\infty} \mathbf{E}_{t0} \, e^{-j\beta\zeta} \, d\zeta = \frac{1}{j\beta} \mathbf{E}_{t0} .    (8)

H_t0 and E_t0 are the tangential field vectors at ζ = 0.

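Together with equation 7, relations 5 and 6 can now be
rewritten as

\mathbf{n} \times \mathbf{E}_{t0} = \frac{-\omega\mu \mathbf{H}_{t0}}{\sqrt{k^2 - k_t^2}} + \frac{\nabla_t \times (\nabla_t \times \mathbf{H}_{t0})}{\omega\varepsilon \sqrt{k^2 - k_t^2}}    (9)

\mathbf{n} \times \mathbf{H}_{t0} = \frac{\omega\varepsilon \mathbf{E}_{t0}}{\sqrt{k^2 - k_t^2}} - \frac{\nabla_t \times (\nabla_t \times \mathbf{E}_{t0})}{\omega\mu \sqrt{k^2 - k_t^2}} .    (10)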

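The transverse components of the outgoing wave may be
transformed into the Fourier domain, where the
tangential derivatives can be expressed as ∇t = −j kt.
Substitution in equations 9 and 10 leads to

\mathbf{n} \times \mathbf{E}_{t0} = \frac{-\omega\mu \mathbf{H}_{t0}}{\sqrt{k^2 - k_t^2}} - \frac{\mathbf{k}_t \times (\mathbf{k}_t \times \mathbf{H}_{t0})}{\omega\varepsilon \sqrt{k^2 - k_t^2}}    (11)

\mathbf{n} \times \mathbf{H}_{t0} = \frac{\omega\varepsilon \mathbf{E}_{t0}}{\sqrt{k^2 - k_t^2}} + \frac{\mathbf{k}_t \times (\mathbf{k}_t \times \mathbf{E}_{t0})}{\omega\mu \sqrt{k^2 - k_t^2}} .    (12)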

These relations between the tangential components of
E_t0 and H_t0 can now be used to model the so-called
surface operator boundary conditions (SOBC) on Γtr.
Equations 11 and 12 allow for any angle of incidence of
the plane wave on a truncating surface Γtr. Since only
perpendicular incidence on the waveguide and on Γtr is
considered, the use of a first-order SOBC with kt = 0 is
reasonable.

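Application of the Galerkin method to the well-known
A, v-formulation makes use of the n × Ht on the Neumann
boundary (see [6]):

-\int_{\Omega} \nabla \times \mathbf{N}_i \cdot \frac{1}{\mu} \nabla \times \mathbf{A} \, d\Omega + \int_{\Gamma_H} \mathbf{N}_i \cdot \underbrace{\left( \mathbf{n} \times \left( \frac{1}{\mu} \nabla \times \mathbf{A} \right) \right)}_{\mathbf{n} \times \mathbf{H}} d\Gamma + \int_{\Omega} \mathbf{N}_i \cdot (\sigma + j\omega\varepsilon) \, j\omega (\mathbf{A} - \nabla v) \, d\Omega = 0 .    (13)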

On the Neumann boundary (ΓH) the underbraced term 

in equation 13 is substituted by the Fourier transformed 

integral of equation 6 which prescribes the truncation of 

the FE-mesh directly. 

B. Surface Impedance Boundary Conditions (SIBC) 

A larger angle of incidence always results in a larger
tangential wave vector kt, so the curl-curl terms in
equations 11 and 12 become increasingly relevant
for achieving accurate boundary conditions. If the wave
propagates perpendicularly to Γtr, the vector kt equals
zero. This is the case considered for all results presented
herein. Hence the second term on the right-hand side of
equations 11 and 12 equals zero and first-order SIBCs
remain:

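\mathbf{n} \times \mathbf{E}_{t0} = \frac{-\omega\mu \mathbf{H}_{t0}}{k} = -\sqrt{\frac{\mu}{\varepsilon}} \, \mathbf{H}_{t0} = -Z_0 \mathbf{H}_{t0}    (14)

\mathbf{n} \times \mathbf{H}_{t0} = \frac{\omega\varepsilon \mathbf{E}_{t0}}{k} = \sqrt{\frac{\varepsilon}{\mu}} \, \mathbf{E}_{t0} = \frac{1}{Z_0} \mathbf{E}_{t0}    (15)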

The impedance of the mesh-terminating plane Γtr can 

now be directly prescribed. 
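The reduction from equation 11 to the first-order condition 14 can be checked numerically. The sketch below evaluates only the leading coefficient ωμ/β of equation 11 (the curl-curl term is left out) and assumes the usual plane-wave relation kt = k·sin θ for an angle of incidence θ measured from the surface normal. The function name and the test frequency are chosen for illustration.

```python
import math

MU_0 = 4.0e-7 * math.pi   # vacuum permeability in H/m (classical value)
EPS_0 = 8.8541878128e-12  # vacuum permittivity in F/m


def leading_coefficient(frequency, angle_deg=0.0, mu=MU_0, eps=EPS_0):
    """Return omega*mu/beta, the factor of H_t0 in the first term of (11).

    Assumes k_t = k*sin(theta); the curl-curl term of (11) is not included.
    """
    omega = 2.0 * math.pi * frequency
    k = omega * math.sqrt(mu * eps)
    k_t = k * math.sin(math.radians(angle_deg))
    beta = math.sqrt(k * k - k_t * k_t)
    return omega * mu / beta


z0 = math.sqrt(MU_0 / EPS_0)
print(f"Z0 = {z0:.2f} Ohm")
print(f"normal incidence at 9 GHz: {leading_coefficient(9.0e9):.2f} Ohm")
print(f"60 degree incidence at 9 GHz: {leading_coefficient(9.0e9, 60.0):.2f} Ohm")
```

For normal incidence the coefficient equals the free-space impedance Z0 regardless of frequency, which is why the first-order condition suffices for the cases considered here. For oblique incidence it grows with 1/cos θ and the curl-curl terms can no longer be neglected.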

III. SETUP 

Fig. 1 shows the setup used for the computation of the 

plane wave field strength incident on the tube and exiting 

it. The right-hand-side of the problem area is terminated 

by means of the introduced SOBC. A plane wave originating 

from the stimulus plane penetrates the tube. Only
a fraction of the incident power "leaks" through it, since
at the frequencies considered it represents a
waveguide-beyond-cutoff (WBC). This small fraction of the
incident wave is terminated without reflection at Γtr. The
aluminium tube, with a square cross-section and lengths
ranging from 20 mm to 80 mm, is shown in detail in Fig. 3.

The grid shown in figure 3 represents the macro 

elements used for modelling only. 

IV. RESULTS 

A. Finite Element Method with SOBC 

Since frequencies above 1 GHz are of interest, simulations
at distinct frequencies in the range of 3 GHz to 18 GHz with
a step width of 3 GHz are considered. At each frequency
the length of the tube is stepped by 10 mm in
the range between 20 mm and 80 mm. The cross-section
of the waveguide is kept constant. Fig. 4 shows the
resulting attenuation of a plane wave on its way through
the WBC. At 15 GHz the attenuation of the incident
wave starts to approach zero and the tube becomes a
waveguide as known from RF applications, as has also
been described in [3]. As long as the frequencies are
below the cutoff frequency, the attenuation depends not only
on the ratio between the frequency used, f, and
the cutoff frequency fc of the structure, but also
on the length of the tube. The relationship is non-linear
and therefore clearly contrasts with the engineering rules
of thumb provided in the following section.

Fig. 3. Details of the waveguide-beyond-cutoff

Fig. 4. Attenuation of a plane wave at distinct lengths and frequencies

Fig. 5 shows the computed field strengths on either side of the waveguide-beyond-cutoff. It is operated at 9 GHz and the right-hand side is terminated by means of the SOBC described before. The colors in the figure refer to the absolute value of the electric field strength of the plane wave at a particular moment. Because a fine mesh is needed to compute the fields along the waveguide (coloured grey), no field strengths are visible there.

Fig. 5. A waveguide 30mm in length, operated at 9GHz

Following the general formula of the power density of a plane wave

$$ \mathbf{S} = \frac{1}{2}\,\mathrm{Re}\left(\mathbf{E} \times \mathbf{H}^{*}\right) \qquad (16) $$

and the impedance of free space of

$$ Z_0 = \sqrt{\frac{\mu_0}{\varepsilon_0}} $$

the attenuation of the power through the waveguide can be calculated. With

$$ a_t = 20 \lg \frac{|E_{max,in}|}{|E_{max,out}|} \qquad (17) $$

the degree of the attenuation at [dB] can be determined based on the field strength of the electrical component of the plane wave on the left-hand side of the tube (|Emaxin|) and on the right-hand side (|Emaxout|). The maxima of the respective field strengths are taken from a line parallel to the x-axis along the centre of the tube.
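The factor 20 in equation 17 follows from equation 16: for a plane wave the power density is proportional to |E|²/Z0, so ten times the logarithm of the power ratio equals twenty times the logarithm of the field ratio. A minimal Python sketch of this evaluation is given below; the function name and the field magnitudes are hypothetical and are not results of the paper.

```python
import math


def attenuation_db(e_max_in, e_max_out):
    """Attenuation a_t = 20*lg(|E_max,in| / |E_max,out|) in dB, as in equation 17."""
    if e_max_in <= 0 or e_max_out <= 0:
        raise ValueError("field magnitudes must be positive")
    return 20.0 * math.log10(e_max_in / e_max_out)


# Hypothetical magnitudes: the field is reduced by a factor of 1000 along the tube
example_db = attenuation_db(1.0, 1e-3)
```

In practice the two arguments would be the maxima sampled along the centre line on the left-hand and right-hand side of the tube.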

B. Engineering Rules 

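For applications using frequencies below approximately 1 GHz, [2] proposes the use of simple "design rules":

$$ f_c = \frac{150}{b}, \qquad f_c\ [\mathrm{GHz}],\ \text{diameter}\ [\mathrm{mm}] \qquad (18) $$

$$ a_t = 27.3\,\frac{l}{b}, \qquad a_t\ [\mathrm{dB}],\ \text{length}\ [\mathrm{mm}] \qquad (19) $$

$$ b = \begin{cases} \text{diameter} \\ \sqrt{2}\,a, & \text{for square cross-section} \end{cases} \quad [\mathrm{mm}] $$

$$ f \le \frac{f_c}{10}, \qquad \text{usable frequency } f \qquad (20) $$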

with fc being the cutoff-frequency in [GHz], at representing 

the shielding effectiveness in [dB] and any 

dimension given in [mm]. Formulae 18 to 20 show that 

the cutoff-frequency only depends on the diameter of the 

tube which is, to some degree, in accordance with [3]. 

A distinction has to be made between square, rectangular and circular waveguides. For rectangular cross-sections, [3] states that the larger dimension governs the cutoff frequency fc. For circular shapes the diameter counts. Note also that the engineering rules do not account for any medium in the waveguide other than free space. Since the WBC is used as a vent with shielding properties, its cutoff wavelength follows

$$ \lambda_c = \frac{c_0}{f_c} \approx 2a. \qquad (21) $$

This is in line with [3] and equation 22 if ɛ = ɛ0 and 

μ = μ0. Waveguides filled with dielectric matter for 

transmission properties are beyond the scope of this work 

since they are neither useful as vents nor as a shielding 

component.
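The design rules 18 to 20 can be collected in a few lines of Python. This is only a sketch: the function names and the tube dimensions in the example are hypothetical, while the constants 150, 27.3 and the factor 10 are those quoted from [2] above.

```python
import math


def effective_dimension_mm(shape, size_mm):
    """b in mm: the diameter for a circular tube, sqrt(2)*a for a square tube of edge a."""
    if shape == "circular":
        return size_mm
    if shape == "square":
        return math.sqrt(2.0) * size_mm
    raise ValueError("shape must be 'circular' or 'square'")


def rule_cutoff_ghz(b_mm):
    return 150.0 / b_mm                   # equation 18


def rule_attenuation_db(b_mm, length_mm):
    return 27.3 * length_mm / b_mm        # equation 19


def rule_max_usable_ghz(b_mm):
    return rule_cutoff_ghz(b_mm) / 10.0   # equation 20


# Hypothetical circular tube: 12.5 mm diameter, 50 mm long
b = effective_dimension_mm("circular", 12.5)
fc = rule_cutoff_ghz(b)              # 12 GHz
at = rule_attenuation_db(b, 50.0)    # 109.2 dB
f_max = rule_max_usable_ghz(b)       # 1.2 GHz
```

Note that the rule for the attenuation is linear in the length and contains no frequency at all, which is exactly the simplification that the numerical results call into question.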

As long as the frequency of interest is below the 

highest usable frequency as given in equation 20 the 

tube yields an attenuation according to equation 19. 

Application of this set of formulae to the waveguide 

under consideration at 12GHz provides the following 

graph (fig. 6): 

Fig. 6. Engineering rules applied at 12GHz 

Figure 6 shows the application of the engineering rules at the cutoff frequency fc = 12 GHz. The shielding effectiveness calculated with the engineering rules (blue dashed line) naturally exceeds the limits obtained numerically, since equation 20 has not been considered so far. This equation is obviously a very rough estimate of the maximum usable frequency: it requires this waveguide not to be used above 1.2 GHz. This is very conservative, since the green solid line (the uppermost line) shows the course of the shielding effectiveness of this particular waveguide at 9 GHz. The engineering rules yield similar results, but on the safe side. Since it is not clear which limit in terms of shielding effectiveness underlies this set of easy-to-use engineering rules, one has to be very careful with its application. Even if it were possible to adjust equation 20 to this result, the behaviour of a waveguide may render this unreliable due to its nonlinear attenuation of a plane wave, as fig. 4 clearly shows.

C. Analytical Approach 

When considering a waveguide-beyond-cutoff (WBC) for shielding purposes, the lowest mode of a TE or TM wave propagating through it is of interest, because it determines the cutoff frequency fc. For waveguides with a square cross-section, [3] gives for the TE10 mode

$$ f_c = \frac{1}{2\sqrt{\varepsilon\mu}}\,\frac{1}{a} \qquad (22) $$

with a being the edge length of the square. For the waveguide used in this work, fc = 14.99 GHz, which matches the result shown in figure 4. With increasing frequency, the attenuation of the plane waves vanishes above approximately 15 GHz regardless of the length of the tube. In other words, illuminating this particular waveguide at frequencies ≥ 15 GHz renders it useless as a shield.
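Equation 22 is easily evaluated for an air-filled tube. The Python sketch below assumes an edge length of a = 10 mm; this value is not stated in this section and is merely inferred here because it reproduces the quoted cutoff frequency. The function name is hypothetical.

```python
import math

MU_0 = 4e-7 * math.pi      # vacuum permeability in H/m (classical value)
EPS_0 = 8.8541878128e-12   # vacuum permittivity in F/m


def te10_cutoff_hz(edge_m, eps=EPS_0, mu=MU_0):
    """Equation 22: fc = 1 / (2 * sqrt(eps * mu) * a) for a square cross-section."""
    return 1.0 / (2.0 * math.sqrt(eps * mu) * edge_m)


# Assumed edge length of 10 mm (an inference, not a value given in the text)
fc_ghz = te10_cutoff_hz(10e-3) / 1e9

# Cutoff wavelength c0 / fc, which equation 21 approximates by 2a
c0 = 1.0 / math.sqrt(EPS_0 * MU_0)
lambda_c_mm = c0 / te10_cutoff_hz(10e-3) * 1e3
```

With these assumptions the cutoff lies just below 15 GHz and the cutoff wavelength equals twice the edge length, consistent with equation 21 for ɛ = ɛ0 and μ = μ0.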


Since waveguides are generally used for the transmission of electromagnetic energy, there are, apart from the engineering rules above, no analytical formulations available to determine the attenuation of a plane wave penetrating a waveguide below its cutoff frequency, because there is no distinct mode of energy flow in the waveguide. For the same reason, no analytical formulations are known for plane waves penetrating a waveguide at angles other than perpendicular to its cross-section (see section V).

V. CONCLUSION 

This paper shows how Surface Operator Boundary 

Conditions (SOBC) can be implemented in an A − v 

formulation to be used with the Galerkin method. The 

SOBC are used to model a Neumann Boundary Condition 

which allows for reflectionless termination of a problem 

area. The use of the SOBC allows for a significant 

speed-up of the computation of the problem because the 

absorbing boundary is only a single term which does not 

require additional finite elements to be modelled. For the 

construction of vents in a shielded room, waveguides below 

their cutoff-frequencies are employed. The described 

model has been used for the computation of the shielding 

effectiveness of waveguides at frequencies exceeding 

1GHz and compared and contrasted to an analytical 

approach and a set of easy-to-use engineering rules. It can now clearly be shown that well-known and verified analytical solutions can be reproduced by numerical models as far as the cutoff frequency of square waveguides is concerned. By the same token, it can be shown that simple design rules are very conservative, i.e. they deliver smaller values of shielding attenuation than can actually be achieved in real designs. It cannot be said that this set of easy-to-use rules is valid only below ≈ 1 GHz.

So far, only plane waves incident perpendicular to an open end of the waveguide have been modelled and computed. Future efforts will address different angles of incidence. There are indications that stacked arrays of waveguides (honeycomb structures) suffer a deterioration of total shielding effectiveness compared to the attenuation provided by a single tube. This behaviour may also be investigated in the future.

REFERENCES 

[1] W. Renhart, C. Magele and C. Tuerk, "Thin Layer Transition Matrix Description Applied to the Finite Element Method", IEEE Trans. on Magn., Vol. 45, No. 3, 2009, pp. 1638-1641.

[2] Louis T. Gnecco, "The Design of Shielded Enclosures", Newnes Press, ISBN 0-7506-7270-6.

[3] Karoly Simonyi, "Theoretische Elektrotechnik", 6. Auflage, VEB Deutscher Verlag der Wissenschaften, Berlin 1977.

[4] W. Renhart, C. Magele and C. Tuerk, "Improved FE-mesh truncation by surface operator implementation to speed up antenna design" (unpublished).

[5] Sergei Tretyakov, "Analytical Modeling in Applied Electromagnetics", 1st ed., Artech House, chapters 2, 3, 2003.

[6] O. Biro, "Edge element formulations of eddy current problems", Computer Methods in Applied Mechanics and Engineering, vol. 169, pp. 391-405, 1999.


Heat Transfer Analysis on End Windings of a Hydro 

Generator using a Stator-Slot-Sector Model 

1, 2 Stephan Klomberg, 3 Ernst Farnleitner, 3 Gebhard Kastner, 1, 2 Oszkár Bíró 

1 Christian Doppler Laboratory for Multiphysical Simulation, Analysis and Design of Electrical Machines, 

Inffeldgasse 18, A-8010 Graz, Austria 

2 Institute for Fundamentals and Theory in Electrical Engineering, Inffeldgasse 18, A-8010 Graz, Austria 

3 Andritz Hydro GmbH, Dr.-Karl-Widdmann-Strasse 5, A-8160 Weiz, Austria 

E-mail: stephan.klomberg@tugraz.at 

Abstract — An accurate evaluation of the convective heat transfer coefficient on end windings needs usually large numerical 

models. These calculations take an enormous amount of time and are not feasible for finding correlations between the convective heat transfer coefficient, mass flow, rotational speed and geometry. On the basis of a parameter study this paper shows that a simplified stator-slot-sector model is as accurate as a pole-sector model but more practicable and faster.

Index Terms—Cooling, Electric machines, Fluid dynamics, 

Heat transfer. 

I. INTRODUCTION


Large hydro generators operate with high efficiency; nevertheless, the losses of these machines can reach several MW. These heat losses must be dissipated from the generator. The design of generator cooling is nowadays well established, and the use of flow and thermal networks for this purpose is state of the art [1]. Flow networks decompose the complex geometry into discrete network elements to calculate the air flow through a machine. They include active, pressure-generating elements such as fans, poles or other rotating components, and passive elements such as ducts, inlets or outlets. The characteristics of these components are determined theoretically, by measurements or by computational fluid dynamics (CFD). The temperature rise in the active parts is computed with thermal networks, in which the convective heat transfer and heat conduction coefficients and a reference air temperature are the major parameters. Examples of flow and thermal networks can be found in [1] and [2].

Strictly speaking, the most important factor, the convective heat transfer coefficient, has to be calculated with large numerical generator models. Analyzing these models takes a long time and is impractical for parameter studies. Since the networks need these coefficients, it is reasonable to consider an equivalent smaller model that enables a faster calculation.

The standard equation for the convective wall heat transfer coefficient is

$$ \alpha = \frac{\dot{q}_W}{T_W - T_{ref}} \qquad (1) $$

According to [3], this equation is applicable to forced convection if \dot{q}_W is the wall heat flux density, T_W the wall temperature and T_ref a reference temperature on a properly chosen control surface in the computational volume.
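The dependence of the coefficient on the chosen reference temperature can be made explicit with a small Python sketch of equation (1). The function name and all numerical values are hypothetical and serve only to illustrate the definition.

```python
def wall_heat_transfer_coefficient(q_wall, t_wall, t_ref):
    """alpha = q_W / (T_W - T_ref) in W/(m^2 K), following equation (1)."""
    delta_t = t_wall - t_ref
    if delta_t == 0:
        raise ValueError("wall and reference temperature must differ")
    return q_wall / delta_t


# Hypothetical values: 2000 W/m^2 wall heat flux density and a 365 K wall
alpha_a = wall_heat_transfer_coefficient(2000.0, 365.0, 325.0)  # reference at 325 K
alpha_b = wall_heat_transfer_coefficient(2000.0, 365.0, 345.0)  # reference at 345 K
```

For the same wall heat flux the two reference temperatures give coefficients that differ by a factor of two, which is why the control surface for T_ref has to be chosen consistently when different models are compared.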

In recent years, several investigations have been carried out on heat transfer at the end windings of electrical machines, especially for totally enclosed fan-cooled induction motors. Most of this work has dealt with smaller machines in a power class of a few kVA. One method obtains the heat transfer coefficients by measuring temperatures and using them in thermal resistances [4], [5]. A second kind of approach involves CFD calculations combined with measurements [6], [7]. The end winding heat transfer of large hydro generators has not yet been investigated.

The large hydro generator presented in this paper is air 

ventilated by a motor-driven fan in radial direction. A 

longitudinal section of the investigated generator is 

shown in Figure 1. The fluid enters the machine at the 

inlet (a) without a spin, flows through the end winding 

bars (d, e) in the pole area (f) and through the stator ducts 

(h) to the outlet (j). 

Figure 1: Profile of the investigated hydro generator showing the (a) inlet, (b) bearing support, (c) support ring, (d) bottom bar, (e) top bar, (f) salient pole, (g) air gap, (h) stator ducts, (i) outlet area, (j) outlet, (k) shaft, and the axis of rotation

The purpose of this paper is to develop a so-called slot-sector model which is smaller and faster to calculate than a standard model with all components and an enormous number of elements. The slot-sector model is investigated and optimized for calculating an accurate heat transfer coefficient.

II. NUMERICAL SIMULATION OF THE HEAT FLUX 

Turbulence models are among the most important parts of numerical fluid simulation. Therefore the two most commonly used models, the standard k-ε model by Jones and Launder [8] and the shear stress transport (SST) model by Menter [9], have been compared.

The fundamental equation to calculate the heat flux at the wall is

$$ \dot{q}_W = \frac{\rho\,c_p\,u^{*}}{T^{+}}\,(T_W - T) \qquad (2) $$

where ρ is the density and c_p is the specific heat capacity. It should be pointed out that the two turbulence models apply different approaches for the dimensionless near-wall velocity u* and the dimensionless temperature at the wall T+. Vieser et al. tested and explained these heat transfer predictions in [10] for different test cases.

The density of the mesh near walls has a strong impact on the wall treatment. It is described by the dimensionless distance from the wall

$$ y^{+} = \frac{u_\tau\,\Delta y}{\nu} \qquad (3) $$

This parameter depends especially on the height Δy of the first element adjacent to the wall. The other parameters are the friction velocity u_τ and the kinematic viscosity ν. The smaller Δy is chosen, the more accurate the calculated heat transfer coefficient becomes. A parameter study in section IV will show this correlation.
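Equation (3) can be used in both directions: to evaluate y+ for a given first element height, or to estimate the height needed for a target y+. The following Python sketch illustrates both; the friction velocity and the kinematic viscosity are hypothetical values for air and are not taken from the simulations.

```python
def y_plus(u_tau, delta_y, nu):
    """Dimensionless wall distance y+ = u_tau * delta_y / nu, equation (3)."""
    return u_tau * delta_y / nu


def first_element_height(y_plus_target, u_tau, nu):
    """Height of the first element that yields the target y+ (equation (3) solved for delta_y)."""
    return y_plus_target * nu / u_tau


U_TAU = 0.5    # hypothetical friction velocity in m/s
NU = 1.8e-5    # hypothetical kinematic viscosity of air in m^2/s

y_plus_coarse = y_plus(U_TAU, 5.0e-3, NU)      # first element height of 5 mm
y_plus_fine = y_plus(U_TAU, 0.05e-3, NU)       # first element height of 0.05 mm
dy_for_unity = first_element_height(1.0, U_TAU, NU)
```

Because u_τ changes with the volume flow rate and the rotational speed, the same mesh yields a different y+ at every working point, so the element height has to be chosen for the most demanding case.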

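A short overview of the influence of y+ on the convective heat transfer coefficient is

$$ y^{+} = f(\Delta y) \;\rightarrow\; T^{+} = f(y^{+}),\ u^{*} = f(y^{+}) \;\rightarrow\; \dot{q}_W = f(T^{+}, u^{*}) \;\rightarrow\; \alpha = f(\dot{q}_W) \qquad (4) $$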

The commercial software package ANSYS-CFX-13 

[11] has been used for the numerical simulations 

described in this paper. There are two main calculation 

methods for a rotor stator simulation, the transient and the 

steady-state approach. A transient calculation needs large 

computing resources and takes a long time. Therefore the 

steady-state method has been chosen. There are two 

variants, the stage method and the frozen rotor method. 

These steady-state approaches are only approximations 

because the transient terms in the flow equations are 

neglected. Nevertheless, their balance of computational 

efficiency and accuracy is ideal for parameter studies 

with many working points. They differ in the treatment of 

the interface between two components. The stage method 

averages the fluxes in circumferential direction on bands 

and transmits these fluxes to the downstream component. 


Only one passage per component has to be modeled, and 

furthermore, it can be used for large pitch ratios which 

highly reduce the number of elements. 
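The idea of circumferential averaging on bands can be sketched in a few lines of Python. This is a strongly simplified illustration with an arithmetic mean over hypothetical sample points; it is not the interface treatment implemented in ANSYS-CFX, and the function name and data are assumptions made for this example only.

```python
import math


def circumferential_band_average(samples, r_min, r_max, n_bands):
    """Average (radius, value) samples over the circumference within radial bands.

    Returns one mean value per band, or None for a band without samples.
    """
    width = (r_max - r_min) / n_bands
    sums = [0.0] * n_bands
    counts = [0] * n_bands
    for radius, value in samples:
        # Clamp so that a sample exactly at r_max falls into the last band
        idx = min(int((radius - r_min) / width), n_bands - 1)
        sums[idx] += value
        counts[idx] += 1
    return [s / c if c else None for s, c in zip(sums, counts)]


# Hypothetical interface data: a radial trend plus a local circumferential disturbance
thetas = [2.0 * math.pi * i / 36 for i in range(36)]
samples = [(r, 10.0 * r + math.sin(t)) for r in (1.0, 2.0) for t in thetas]
band_means = circumferential_band_average(samples, 1.0, 2.0, 2)
```

The radial trend survives the averaging while the circumferential disturbance disappears, which illustrates why local structures upstream of the interface cannot be passed on to the downstream component.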

The frozen rotor approach works with a change of the frame of reference at the interface without averaging the fluxes. Therefore, more passages per component have to be modeled. The conservation equations for the rotor are solved in a rotating frame, those of the stator in a stationary frame of reference. Velocity and pressure are kept consistent across the interface. These relations are illustrated in Figure 2 and explained in detail in [11].

Figure 2: Steady-state methods: stage and frozen rotor 

R1/ R2 and S1/ S2 are rotational periodicities; pR/ pS are 

pitch ratios 

The standard ideal gas setting in ANSYS-CFX-13 is temperature independent, i.e. the thermal conductivity λ, the specific heat capacity c_p and the dynamic viscosity μ are constant. This is a simplification of reality which may make the results less accurate. Consequently, an ideal gas with temperature dependence has been modeled and compared to a temperature-independent ideal gas.

The dynamic viscosity (5) and the thermal conductivity (6) have been modeled with Sutherland's formula [11]. The reference temperature T_ref has been set to 325 K. The reference molecular viscosity μ_0, the reference molecular conductivity λ_0, the Sutherland constants S_μ / S_λ and the temperature exponents n_μ / n_λ are listed below for both equations.

$$ \frac{\mu}{\mu_0} = \frac{T_{ref} + S_\mu}{T + S_\mu} \left( \frac{T}{T_{ref}} \right)^{n_\mu} \qquad (5) $$

S_μ = 77.8 K
μ_0 = 1.972·10^-5 Pa s
n_μ = 1.574

$$ \frac{\lambda}{\lambda_0} = \frac{T_{ref} + S_\lambda}{T + S_\lambda} \left( \frac{T}{T_{ref}} \right)^{n_\lambda} \qquad (6) $$

S_λ = 60.7 K
λ_0 = 2.82·10^-2 W/(m K)
n_λ = 1.676

As shown in equation (7), the specific heat capacity has been calculated with the zero pressure polynomial [11].

$$ \frac{c_p}{R_S} = t_1 + t_2 T + t_3 T^2 + t_4 T^3 + t_5 T^4 \qquad (7) $$

R_S = 287.058 J/(kg K)
t_1 = 3.574
t_2 = -4.2691·10^-4
t_3 = -4.1854·10^-8
t_4 = 3.0986·10^-9
t_5 = -2.3848·10^-12

All coefficients have been obtained by automatically fitting them to measured thermodynamic properties of dry air collected in [12]. These values are valid in a temperature range from 260 K to 670 K.
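The temperature-dependent properties of equations (5) to (7) can be reproduced with the short Python sketch below. The coefficients are those listed above; the function names are hypothetical, and the assignment of the two constant sets to viscosity and conductivity follows their units.

```python
T_REF = 325.0  # reference temperature in K


def sutherland(t, ref_value, s_const, exponent, t_ref=T_REF):
    """Sutherland-type law used in equations (5) and (6)."""
    return ref_value * (t_ref + s_const) / (t + s_const) * (t / t_ref) ** exponent


def dynamic_viscosity(t):
    """Equation (5), result in Pa s."""
    return sutherland(t, 1.972e-5, 77.8, 1.574)


def thermal_conductivity(t):
    """Equation (6), result in W/(m K)."""
    return sutherland(t, 2.82e-2, 60.7, 1.676)


R_S = 287.058  # specific gas constant in J/(kg K)
T_COEFFS = (3.574, -4.2691e-4, -4.1854e-8, 3.0986e-9, -2.3848e-12)


def specific_heat(t):
    """Equation (7): c_p = R_S * (t1 + t2*T + t3*T^2 + t4*T^3 + t5*T^4), in J/(kg K)."""
    return R_S * sum(c * t ** i for i, c in enumerate(T_COEFFS))


# Evaluation at the reference temperature and at a hypothetical 400 K
mu_ref = dynamic_viscosity(T_REF)
lambda_ref = thermal_conductivity(T_REF)
cp_ref = specific_heat(T_REF)
mu_hot = dynamic_viscosity(400.0)
```

At T = T_ref the Sutherland expressions return the reference values μ_0 and λ_0, and both properties increase with temperature. The functions should only be evaluated within the stated validity range of 260 K to 670 K.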

III. EXPLANATION OF THE 3D MODEL 

This chapter shows the structure of the reference model 

called pole-sector model (PSM). Four different slot-sector 

models (SSM) will be explained, too. 

The reference model has been reduced to one pole 

sector of the whole circumference of the generator. A 

rotational periodic condition is given at both 

circumferential sides. A symmetry condition in axial 

direction further reduces the numerical effort. The model 

is shown in Figure 3. It is simulated with the frozen rotor 

approach and the number of elements is about 30 million. 

The calculation time of the PSM is longer than a week because of this large number of elements. Nevertheless, the mesh of this model is rather coarse over the whole volume. This is especially important near the walls of the end windings.

Figure 3: Pole-sector model: (a) inlet, (b) bearing support, (c) support ring, (d) bottom bar, (e) top bar, (f) salient pole, (g) air gap, (h) stator ducts, (i) outlet area, (j) outlet

Measuring temperatures at walls in a large hydro generator demands great effort and a long preparation time. The sensors must be fixed during the construction of the components, which makes such investigations complicated and expensive. Not least for these reasons, the calculated wall heat transfer coefficients (WHTC) of the PSM have been taken as reference values. By simulating several working points with different volume flow rates and rotational speeds, a wide range has been covered. The volume flow rate is set as the inlet boundary condition and the static pressure as the outlet condition. All walls of the model, in particular the end winding walls, have a fixed temperature. Conduction in solids is not considered.

The aim of the investigations is to develop a simplified model with acceptable computational demands for a numerical parameter study. The best-fitting computational approach for this purpose is the stage model. The question is how far the PSM can be reduced while maintaining similar accuracy. To clarify this, four different simplifications have been modeled.

The first idea was to reduce the model as much as 

possible. In order to achieve this, the whole generator has 

been reduced to a circumferential section of one slot. 

Furthermore, the rotor, the stator ducts and the outlet area 

are not considered. The interface between the rotor 

domain and the inlet domain as well as the air gap serves 

as a simplified outlet. Due to this, it is difficult to find an 

appropriate boundary condition at the simplified outlet. 

Only one end winding bar is considered and a diffuser 

has been put in front of the inlet to get a radial inflow 

onto the end winding area. The number of elements is 

only about 0.6 million due to all these reductions. This 

slot-sector model is named SSM_1. 

The next step was extending the model SSM_1 with 

the rotor domain to get the second model (SSM_2). The 

outlet is moved to the symmetry plane of the rotor. The 

interface between the rotor and the stator ducts is 

assumed to be a fixed wall. The number of elements is 

less than 1 million. 

The third model (SSM_3) is enhanced with the stator 

ducts and the outlet area. These parts have also a 

circumferential extension of one slot only. Because of 

this, the same boundary conditions as in the PSM are 

possible. The number of elements rises to 1.5 million. 

The last remaining problem has been the inflow. A slot section of the inlet domain does not allow a three-dimensional spreading of the flow onto the end windings. Considering the entire inlet area leads to the last simplified slot-sector model, called SSM_4. This final model includes all components, but it contains only one slot with a pitch ratio of 22.5, i.e. one end winding bar and its surrounding stator ducts are modeled. This model is best suited to the steady-state stage approach, and a circumferential averaging of the WHTC is expected to be appropriate.

The number of elements has been reduced to 2 million. 

On the one hand, only one slot section has been modeled, 

and on the other hand, the rotor and the inlet domains 

have been geometrically simplified and meshed coarser 

than the components of the PSM. The meshes of the end 

winding bars of the PSM and of all four SSM have the 

same structure and mesh density. 

An accurate prediction of the heat transfer coefficient 

is possible with a fine near wall mesh only. By means of 

a parameter study, the influence of the mesh density on 

the WHTC has been investigated. The end windings’ 

domain is meshed starting from an extremely coarse grid

to a very fine one. These meshes are listed in Table I.

TABLE I
DIFFERENT MESHES OF THE END WINDING BAR

Δy of first element [mm]    Number of elements
5.00                        45,000
3.00                        65,000
2.00                        81,000
1.00                        146,000
0.50                        318,000
0.25                        693,000
0.12                        989,000
0.06                        1,492,000
0.05                        1,682,000

The focus has been on a defined height of the first 

element at the walls. The rest of the volume is 

automatically meshed with a defined ratio of growth and 

a Poisson distribution normal to the wall [13]. 

IV. RESULTS 

The evaluation has been carried out by calculating the 

WHTC at the end windings as defined in (1). Further 

results are normalized to the reference values for a better 

overview. The end winding bar is split into 5 zones to get 

the variation of the WHTC along the bar. Figure 4 shows 

the zones, beginning with T1 and T2 on the top bar. TB3 

is the junction from the top to the bottom bar and it is 

followed by B2 and B1. 

Figure 4: End winding bar with 5 zones 

Figure 5 shows the comparison of the WHTCs obtained by the four slot-sector models along an end winding bar. As a criterion for an acceptable agreement between the PSM and the SSMs, a ratio PSM / SSM in the range of 0.8 to 1.2 has been chosen. The first three slot-sector models cannot meet this target; the area TB3 at the end of the bar in particular is too inaccurate. The version SSM_4 is just within the range, except in zone T1. This area of the top bar is located at the beginning of the air gap and the rotor and is strongly influenced by the motion of the rotor. This can also be seen in Figure 6, details x and y, where the velocity is very high.

Figure 5: Comparison of the four slot-sector models (ratio PSM / SSM in the zones T1, T2, TB3, B2 and B1 for SSM_1, SSM_2, SSM_3 and SSM_4)

Figure 6 shows the comparison of the velocity contours 

in the symmetry plane of the inlet for the models SSM_4 

and SSM_3. The contours of SSM_1 and SSM_2 are 

similar to SSM_3. These pictures essentially show that 

the inflow velocity is higher if the whole inlet is used for 

the calculations. Hence the ratios of the slot-sector 

models 1-3 in the zones TB3, B2 and B1 in Figure 5 are 

out of the range of 0.8 - 1.2. 

Figure 6: Velocity in the symmetry plane of the inlet domain of a) SSM_4 and b) SSM_3 (details x and y marked)

The interaction between the rotor and the end windings is 

illustrated in Figure 7. The turbulent kinetic energy at the 

interfaces between the inlet and the rotor as well as 

between the top bar and the inlet is shown. There are 

vortices with high energy at the PSM contour seen in 

Figure 7a. The computation with the model SSM_4

generates the well known circumferential bands (see 

Figure 7b) characteristic of the stage method. In other 

words, by averaging the physical values on 

circumferential bands it is not possible to calculate local 

vortices. Therefore the use of a slot-sector model 

underestimates the rotor stator interaction. 

Figure 7: Turbulent kinetic energy on selected interfaces (Inlet - Top bar and Inlet - Rotor) in a) PSM, b) SSM_4

The graph in Figure 8 shows the heat transfer coefficient as a function of the dimensionless distance from the wall at the top bar. The SST and the k-ε turbulence models have been used for this study. The WHTC increases with decreasing dimensionless distance from the wall. The coefficient reaches its peak at about y+ = 5 and fluctuates around the maximum value. The factor y+ is very sensitive to variations of the near-wall velocity caused by a different volume flow rate or rotational speed with the same mesh and geometry. This mesh refinement study confirms the investigations of [14].

Figure 8: Mesh refinement study in the zones T1, T2, TB3 with the k-ε and the SST turbulence model (normalized WHTC versus y+ [-])

Based on the previous investigations, a parameter study with various working points has been carried out. The slot-sector model SSM_4 has been used with different operating conditions, always with the SST turbulence model. The results have been averaged and a standard deviation has been calculated. Figure 9 shows the results obtained.

Figure 9: Normalized WHTC in dependence of y+ and the type of ideal gas (ratio PSM / SSM in the zones T1, T2, TB3, B2 and B1; curves: averaged ratio for y+ < 30 with temperature-independent ideal gas, averaged ratio for y+ < 1 with temperature-independent ideal gas, averaged ratio for y+ < 1 with temperature-dependent ideal gas, each with standard deviation)

First, the SSM has been calculated with the same mesh density near the end winding walls as the PSM. The results lie within the given range of 0.8 to 1.2. The second simulation run has been carried out with the finest mesh of the end winding domain. The resulting curve decreases nearly parallel to the first one. Since the reference model has only been simulated with a coarse mesh near the end winding bars, an estimation of the accuracy is not possible. These two curves have been calculated with a temperature-independent ideal gas. The last curve has therefore been simulated with a temperature-dependent ideal gas. Assuming temperature independence increases the ratio of the heat transfer coefficients by about 5%.

Regarding these findings, it can be stated that a slot-sector model with a very fine mesh near walls and an adjusted ideal gas leads to sufficiently accurate results.

V. CONCLUSION 

The comparison of a pole-sector model with various 

slot-sector models shows that a simplification with less 

numerical effort is possible. An extreme reduction of the 

generator is not recommended, because all components 

have to be considered. Because only one slot is modeled, the stage approach is an adequate and fast calculation method for this kind of model structure. The averaged deviation of the wall heat transfer coefficient from the reference values is about 12%. Possible improvements of the slot-sector model are the use of an adjusted ideal gas and a fine mesh near walls. Furthermore, the influence of the dimensionless distance from the wall has been confirmed.

ACKNOWLEDGMENT 

This work has been supported by the Christian 

Doppler Research Association (CDG) and by the Andritz 

Hydro GmbH. 


REFERENCES 

[1] E. Farnleitner and G. Kastner, "Moderne Methoden der 

Ventilationsauslegung von Pumpspeichergeneratoren," e&i, vol. 

127, pp. 24-29, 2010. 

[2] G. Traxler-Samek, R. Zickermann and A. Schwery, "Cooling 

airflow, losses, and temperatures in large air-cooled synchronous 

machines," IEEE Transactions on Industrial Electronics, vol. 57, 

no. 1, pp. 172-180, Jan. 2010. 

[3] H. Herwig, "Kritische Anmerkungen zu einem weitverbreiteten Konzept: der Wärmeübergangskoeffizient α," Forschung im Ingenieurwesen, vol. 63, pp. 13-17, 1997.

[4] A. Boglietti and A. Cavagnino, "Analysis of the endwinding cooling effects in TEFC induction motors," IEEE Transactions on Industry Applications, vol. 43, no. 5, pp. 1214-1222, Sept.-Oct. 2007.

[5] A. Boglietti, A. Cavagnino, D. Staton and M. Popescu, 

"Experimental assessment of end region cooling arrangements in 

induction motor endwindings," IET Electric Power Applications, 

vol. 5, no. 2, pp. 203-209, Feb. 2011. 

[6] C. Micallef, S. Pickering, K. Simmons and K. Bradley, "Improved cooling in the end region of a strip-wound totally enclosed fan-cooled induction electric machine," IEEE Transactions on Industrial Electronics, vol. 55, no. 10, pp. 3517-3524, Oct. 2008.

[7] M. Hettegger, B. Streibl, O. Bíró and H. Neudorfer, "Identifying 

the heat transfer coefficients on the end-winding of an electrical 

machine by measurements and simulations," in 19th ICEM, Rome, 

2010. 

[8] W. P. Jones and B. E. Launder, "The prediction of laminarization 

with a two-equation model of turbulence," International Journal of 

Heat and Mass Transfer, vol. 15, no. 2, pp. 301-314, Feb. 1972. 

[9] F. R. Menter, "Two-equation eddy-viscosity turbulence models for 

engineering applications," AIAA Journal, vol. 32, pp. 1598-1605, 

1994. 

[10] W. Vieser, T. Esch and F. Menter, "Heat transfer predictions using 

advanced two-equation turbulence models," CFX Technical 

Memorandum, vol. CFX-VAL10/0602, 2002. 

[11] "ANSYS 13.0 documentation," ANSYS, Inc., Canonsburg, 2010. 

[12] F. Kreith, R. M. Manglik and M. S. Bohn, Principles of Heat Transfer, 7th ed., Stamford: Cengage Learning, 2011.

[13] "ANSYS ICEM CFD 13.0 documentation," ANSYS, Inc., 

Canonsburg, 2010. 

[14] M. Schrittwieser, A. Marn, E. Farnleitner and G. Kastner, 

"Numerical analysis of heat transfer and flow of stator duct 

models," in 20th ICEM, Marseille, 2012.


Numerical Investigation of Linear Systems Obtained 

by Extended Element-Free Galerkin Method 

Taku Itoh∗ , Soichiro Ikuno∗ , and Atsushi Kamitani † 

∗ Tokyo University of Technology, 1404-1 Katakura, Hachioji, Tokyo 192-0982, Japan 

† Yamagata University, 4-3-16 Johnan, Yonezawa, Yamagata, 992-8510, Japan 

E-mail: taku@m.ieice.org 

Abstract—To impose not only the essential boundary condition but also the natural one without any integrations, the 

Element-Free Galerkin method (EFG) has been reformulated, and this method is called an eXtended EFG (X-EFG). A 

linear system obtained by the X-EFG becomes an asymmetric saddle point problem. Numerical experiments show that, 

by using IC-Bi-CGSTAB and IC-GMRES(m), the linear system can be solved more than 9 times faster than by the LU factorization in relatively large problems. However, there are cases where these iterative methods do not converge, regardless of the condition number. To avoid these cases, and to stably solve the linear system as fast as possible, 

a flow chart for choosing an appropriate solver has been constructed by using the results of the numerical experiments. 

Index Terms—Meshless methods, Element-free Galerkin methods, Saddle point problems, Asymmetric linear systems 


I. INTRODUCTION 

Meshless methods such as the Element-Free Galerkin method (EFG) [1] and the Meshless Local Petrov-Galerkin method (MLPG) [2] have been widely applied to numerical simulations in many fields, including electromagnetics [3], [4], [5], [6], [7]. In meshless methods, elements of a geometrical structure are no longer necessary. 

In the EFG in particular, the Lagrange multiplier [1] is employed for imposing the essential boundary condition. Recently, the EFG has been reformulated together with a new method for imposing the essential boundary condition [8]. In this EFG, the essential boundary condition can be satisfied without any integrations. However, it must be noted that, in this EFG, the natural boundary condition is still imposed by evaluating integrals. For a three-dimensional (3D) problem in particular, surface integrals must be evaluated for imposing the natural boundary condition. For this reason, an easier method for imposing the natural boundary condition would be helpful for developing a numerical code based on the EFG. 

The purpose of the present study is to reformulate 

a 3D EFG so that not only the essential boundary 

condition but also the natural one can be imposed without 

any integrations. The reformulated method is called an 

eXtended EFG (X-EFG). A linear system obtained by 

the X-EFG becomes a saddle point problem, and the 

coefficient matrix is asymmetric. For the purpose of 

stably obtaining a numerical solution as fast as possible, 

appropriate solvers for the asymmetric linear system are 

also investigated numerically. 

II. EXTENDED ELEMENT-FREE GALERKIN METHOD 

A. New Reformulation 

In this section, we describe a new reformulation of 

EFG which is different from that described in [8]. For 

simplicity, we consider a 3D Poisson problem: 

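\[ -\Delta u = f \quad \text{in } V, \tag{1} \]
\[ u = \bar{u} \quad \text{on } S_D, \tag{2} \]
\[ \frac{\partial u}{\partial n} = \bar{q} \quad \text{on } S_N, \tag{3} \]

where V is a region bounded by a simple closed surface ∂V that consists of both $S_D$ and $S_N$. Here, $S_D$ and $S_N$ satisfy $S_D \cup S_N = \partial V$ and $S_D \cap S_N = \emptyset$. In addition, $\bar{u}$ and $\bar{q}$ are known functions on $S_D$ and $S_N$, respectively, and n is an outward unit normal on ∂V. Furthermore, f(x) is a given function on V, and $\mathbf{x} = [x, y, z]^T$. 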

From (1), the weak form is derived as 

\[ \forall w \ \text{s.t.}\ w|_{\partial V} = 0 : \quad \int_V \nabla w \cdot \nabla u \, d^3x = \int_V w f \, d^3x, \tag{4} \]

where w(x) is a test function. Note that the constraint on w(x) in (4) is different from that in [8]. 

To discretize (4), the nodes $\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_N$ are first placed both in V and on ∂V, and shape functions $\phi_1(\mathbf{x}), \phi_2(\mathbf{x}), \ldots, \phi_N(\mathbf{x})$ are determined by using the Moving Least-Squares (MLS) approximation [1], [2], [7]. Here, N is the number of nodes in V ∪ ∂V. In the following, M denotes the number of nodes on ∂V. In addition, the orthonormal system in $\mathbb{R}^N$ and that in $\mathbb{R}^M$ are denoted by $\{\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_N\}$ and $\{\bar{\mathbf{e}}_1, \bar{\mathbf{e}}_2, \ldots, \bar{\mathbf{e}}_M\}$, respectively. 

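Let us first discretize the weak form (4). To this end, we assume that u and w can be expanded with $\phi_i(\mathbf{x})$ (i = 1, 2, ..., N) as follows: 

\[ u(\mathbf{x}) = \sum_{i=1}^{N} \hat{u}_i \phi_i(\mathbf{x}), \qquad w(\mathbf{x}) = \sum_{i=1}^{N} \hat{w}_i \phi_i(\mathbf{x}). \tag{5} \]

By substituting (5) into (4), we obtain 

\[ (\hat{\mathbf{w}}, A\hat{\mathbf{u}} - \mathbf{f}) = 0, \tag{6} \]

where 

\[ \hat{\mathbf{w}} = [\hat{w}_1, \hat{w}_2, \ldots, \hat{w}_N]^T, \tag{7} \]

and 

\[ \hat{\mathbf{u}} = [\hat{u}_1, \hat{u}_2, \ldots, \hat{u}_N]^T. \tag{8} \]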

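In addition, A and $\mathbf{f}$ are defined as 

\[ A \equiv \sum_{i=1}^{N} \sum_{j=1}^{N} \int_V \nabla\phi_i \cdot \nabla\phi_j \, d^3x \; \mathbf{e}_i \mathbf{e}_j^T, \tag{9} \]

\[ \mathbf{f} \equiv \sum_{i=1}^{N} \int_V \phi_i f \, d^3x \; \mathbf{e}_i. \tag{10} \]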

Next, the constraint $w|_{\partial V} = 0$ in (4) is discretized. To this end, the constraint is rewritten as the equivalent proposition: 

\[ \forall \beta(s,t) : \quad \int_{\partial V} \beta(s,t)\, w(\mathbf{x}(s,t)) \, dS = 0, \tag{11} \]

and an arbitrary function β(s, t) is assumed to be contained in $\mathrm{span}(N_1, N_2, \ldots, N_M)$, where $N_1(s,t), N_2(s,t), \ldots, N_M(s,t)$ are linearly independent functions on ∂V. Here, s and t are parameters for representing ∂V. By using $N_1(s,t), N_2(s,t), \ldots, N_M(s,t)$, (11) can be discretized as 

\[ (\hat{\mathbf{w}}, \mathbf{c}_k) = 0 \quad (k = 1, 2, \ldots, M), \tag{12} \]

where $\mathbf{c}_k$ (k = 1, 2, ..., M) are defined as 

\[ \mathbf{c}_k \equiv \sum_{i=1}^{N} \int_{\partial V} N_k(s,t)\, \phi_i(\mathbf{x}(s,t)) \, dS \; \mathbf{e}_i. \tag{13} \]

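Note that (12) indicates $\hat{\mathbf{w}} \in \mathcal{V}^{\perp}$, where 

\[ \mathcal{V} = \mathrm{span}(\mathbf{c}_1, \mathbf{c}_2, \ldots, \mathbf{c}_M). \tag{14} \]

Hence, the weak form (4) can be discretized as 

\[ \forall \hat{\mathbf{w}} \in \mathcal{V}^{\perp} : \quad (\hat{\mathbf{w}}, A\hat{\mathbf{u}} - \mathbf{f}) = 0. \tag{15} \]

Since $(\mathcal{V}^{\perp})^{\perp} = \mathcal{V}$, (15) can be written as 

\[ A\hat{\mathbf{u}} - \mathbf{f} \in \mathcal{V}. \tag{16} \]

Therefore, there exists $\hat{\mathbf{v}} \in \mathbb{R}^M$ such that 

\[ A\hat{\mathbf{u}} + C\hat{\mathbf{v}} = \mathbf{f}, \tag{17} \]

where $C \equiv [\mathbf{c}_1, \mathbf{c}_2, \ldots, \mathbf{c}_M]$ and $\hat{\mathbf{v}} \equiv [\hat{v}_1, \hat{v}_2, \ldots, \hat{v}_M]^T$. 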

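Finally, the essential boundary condition (2) and the natural one (3) are simultaneously discretized. By a procedure similar to that for discretizing the constraint $w|_{\partial V} = 0$, (2) and (3) can be discretized as 

\[ D^T \hat{\mathbf{u}} = \mathbf{g}, \tag{18} \]

where $D \equiv [\mathbf{d}_1, \mathbf{d}_2, \ldots, \mathbf{d}_M]$, and 

\[ \mathbf{d}_k \equiv \begin{cases} \displaystyle \sum_{i=1}^{N} \int_{\partial V} N_k(s,t)\, \phi_i(\mathbf{x}(s,t)) \, dS \; \mathbf{e}_i, & \text{for } \mathbf{x}_k \in S_D, \\[2ex] \displaystyle \sum_{i=1}^{N} \int_{\partial V} N_k(s,t)\, \frac{\partial \phi_i}{\partial n}(\mathbf{x}(s,t)) \, dS \; \mathbf{e}_i, & \text{for } \mathbf{x}_k \in S_N \end{cases} \quad (k = 1, 2, \ldots, M). \tag{19} \]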


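In addition, $\mathbf{g} \equiv [g_1, g_2, \ldots, g_M]$, and 

\[ g_k \equiv \begin{cases} \displaystyle \int_{S_D} N_k(s,t)\, \bar{u}(s,t) \, dS, & \text{for } \mathbf{x}_k \in S_D, \\[2ex] \displaystyle \int_{S_N} N_k(s,t)\, \bar{q}(s,t) \, dS, & \text{for } \mathbf{x}_k \in S_N \end{cases} \quad (k = 1, 2, \ldots, M). \tag{20} \]

Note that, for $\mathbf{x}_k \in S_D$, $\mathbf{d}_k$ is exactly the same as $\mathbf{c}_k$. 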

Equations (17) and (18) can be written in the form 

\[ \begin{bmatrix} A & C \\ D^T & O \end{bmatrix} \begin{bmatrix} \hat{\mathbf{u}} \\ \hat{\mathbf{v}} \end{bmatrix} = \begin{bmatrix} \mathbf{f} \\ \mathbf{g} \end{bmatrix}. \tag{21} \]

Equation (21) is a discretized form of the Poisson problem. This completes the reformulation of the EFG. In the following, the reformulated EFG is referred to as an eXtended EFG (X-EFG). 

B. Selection of linearly independent functions 

As mentioned above, the linearly independent functions $N_k$ (k = 1, 2, ..., M) are required for discretizing the essential and natural boundary conditions. Here, the δ-functions defined on ∂V are employed as $N_k$ (k = 1, 2, ..., M) so that these boundary conditions may be satisfied exactly. The explicit form of $N_k(s,t)$ is given as 

\[ N_k(s,t) = \frac{\delta(s - s_k)\, \delta(t - t_k)}{\left| \dfrac{\partial \mathbf{x}}{\partial s} \times \dfrac{\partial \mathbf{x}}{\partial t} \right|} \quad (k = 1, 2, \ldots, M). \tag{22} \]

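Note that, on ∂V, the kth boundary node $\mathbf{x}_k$ is represented by $s_k$ and $t_k$, i.e., $\mathbf{x}_k = \mathbf{x}(s_k, t_k)$. By using (22), C, $\mathbf{d}_k$ and $g_k$ (k = 1, 2, ..., M) can be rewritten as 

\[ C = \sum_{i=1}^{N} \sum_{k=1}^{M} \phi_i(\mathbf{x}(s_k, t_k)) \, \mathbf{e}_i \bar{\mathbf{e}}_k^T, \tag{23} \]

\[ \mathbf{d}_k = \begin{cases} \displaystyle \sum_{i=1}^{N} \phi_i(\mathbf{x}(s_k, t_k)) \, \mathbf{e}_i, & \text{for } \mathbf{x}_k \in S_D, \\[2ex] \displaystyle \sum_{i=1}^{N} \frac{\partial \phi_i}{\partial n}(\mathbf{x}(s_k, t_k)) \, \mathbf{e}_i, & \text{for } \mathbf{x}_k \in S_N, \end{cases} \tag{24} \]

\[ g_k = \begin{cases} \bar{u}(s_k, t_k), & \text{for } \mathbf{x}_k \in S_D, \\ \bar{q}(s_k, t_k), & \text{for } \mathbf{x}_k \in S_N. \end{cases} \tag{25} \]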

It must be noted here that, in the X-EFG, the coefficient 

matrix is not symmetric except for the case where 

∂V = SD. However, the essential and natural boundary 

conditions can easily be imposed, since C, D and g can 

be evaluated without any integrations. 
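The block structure of (21) and the symmetry remark above can be illustrated with a small sketch. The matrices below are hypothetical toy data (N = 3, M = 2), not values produced by an actual MLS approximation, and the helper names `assemble_saddle` and `is_symmetric` are introduced only for this illustration.

```python
def assemble_saddle(A, C, D):
    """Return the (N+M) x (N+M) block matrix [[A, C], [D^T, O]] as nested lists."""
    n = len(A)        # number of nodes N
    m = len(C[0])     # number of boundary nodes M
    top = [list(A[i]) + list(C[i]) for i in range(n)]
    # Row k of the lower block is column k of D, followed by M zeros.
    bottom = [[D[i][k] for i in range(n)] + [0.0] * m for k in range(m)]
    return top + bottom


def is_symmetric(K, tol=1e-12):
    """Check K == K^T entrywise within tol."""
    size = len(K)
    return all(abs(K[i][j] - K[j][i]) <= tol
               for i in range(size) for j in range(i + 1, size))


# Hypothetical toy data: A is symmetric, C holds shape-function values.
A = [[2.0, -1.0, 0.0],
     [-1.0, 2.0, -1.0],
     [0.0, -1.0, 2.0]]
C = [[1.0, 0.0],
     [0.0, 0.0],
     [0.0, 1.0]]
# Column 1 of D equals column 1 of C (boundary node on S_D);
# column 2 holds normal-derivative values instead (boundary node on S_N).
D = [[1.0, -0.5],
     [0.0, 0.0],
     [0.0, 0.5]]

K = assemble_saddle(A, C, D)                 # mixed boundary: asymmetric
K_dirichlet_only = assemble_saddle(A, C, C)  # dV = S_D: D coincides with C
```

With these toy values, `K` is asymmetric because one column of D differs from the corresponding column of C, whereas the all-Dirichlet case gives a symmetric matrix.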

III. SOLVING LINEAR SYSTEM (21) 

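A linear system whose coefficient matrix has a 2 × 2 block structure as in (21) is called a saddle point problem. In this section, we consider solving the saddle point problem (21). For the following discussion, (21) is rewritten as 

\[ \mathcal{A} \hat{\mathbf{x}} = \mathbf{b}, \tag{26} \]

where 

\[ \mathcal{A} \equiv \begin{bmatrix} A & C \\ D^T & O \end{bmatrix}, \quad \hat{\mathbf{x}} \equiv \begin{bmatrix} \hat{\mathbf{u}} \\ \hat{\mathbf{v}} \end{bmatrix}, \quad \text{and} \quad \mathbf{b} \equiv \begin{bmatrix} \mathbf{f} \\ \mathbf{g} \end{bmatrix}. \tag{27} \]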

A. Direct solvers 

As a direct solver for saddle point problems, there is a method that utilizes the 2 × 2 block structure of $\mathcal{A}$ [9]. In this method, the block A is decomposed by the Cholesky factorization. Since A is the main part of $\mathcal{A}$, the computational cost may be lower than that of Gaussian elimination. However, we do not employ this method for solving (21), because the Cholesky factorization of A failed in some cases in preliminary numerical experiments. Thus, we consider that the method in [9] is not stable for solving (21) in this problem. 

It must be noted here that $\mathcal{A}$ can be decomposed by the LU factorization. Hence, we adopt the LU factorization as a direct solver. In addition, an ordering method is used to reduce fill-in before the LU factorization is executed. 

B. Iterative Schemes for Solving Saddle Point Problems 

As an iterative scheme for solving saddle point problems, Uzawa's method [10] is well known. Starting with initial guesses $\hat{\mathbf{u}}_0$ and $\hat{\mathbf{v}}_0$, Uzawa's method consists of the following coupled iteration: 

\[ A\hat{\mathbf{u}}_{k+1} = \mathbf{f} - C\hat{\mathbf{v}}_k, \tag{28} \]

\[ \hat{\mathbf{v}}_{k+1} = \hat{\mathbf{v}}_k + \omega (D^T \hat{\mathbf{u}}_{k+1} - \mathbf{g}), \tag{29} \]

where ω > 0 is a relaxation parameter. In (28), a linear system of the size of A must be solved in every iteration. Hence, if the size of A is large, the computational cost for solving (28) may be expensive. 
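A minimal sketch of the coupled iteration (28) and (29) is given below, assuming small dense matrices stored as nested lists. The inner solver `solve_dense`, the toy matrices and the value ω = 1 are assumptions made for this illustration and are not taken from the experiments in this paper. Note that the inner solve with A is performed in every iteration, which is the cost discussed above.

```python
def solve_dense(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [list(A[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def uzawa(A, C, D, f, g, omega, tol=1e-12, max_iter=10000):
    """Uzawa iteration for [[A, C], [D^T, O]] [u; v] = [f; g], starting from v = 0."""
    n, m = len(A), len(g)
    v = [0.0] * m
    for k in range(max_iter):
        # (28): solve A u_{k+1} = f - C v_k
        rhs = [f[i] - sum(C[i][j] * v[j] for j in range(m)) for i in range(n)]
        u = solve_dense(A, rhs)
        # Constraint residual D^T u_{k+1} - g
        res = [sum(D[i][j] * u[i] for i in range(n)) - g[j] for j in range(m)]
        if max(abs(r) for r in res) <= tol:
            return u, v, k + 1
        # (29): v_{k+1} = v_k + omega (D^T u_{k+1} - g)
        v = [v[j] + omega * res[j] for j in range(m)]
    raise RuntimeError("Uzawa iteration did not converge")


# Hypothetical toy problem with exact solution u = [0.5, 1.0], v = [2.0].
A = [[2.0, 0.0], [0.0, 3.0]]
C = [[1.0], [0.0]]
D = [[0.5], [0.0]]
f = [3.0, 3.0]
g = [0.25]
u, v, iterations = uzawa(A, C, D, f, g, omega=1.0)
```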

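On the other hand, the Arrow-Hurwicz method [10] is also well known as an iterative scheme that is inexpensive in comparison with Uzawa's method. Starting with initial guesses $\hat{\mathbf{u}}_0$ and $\hat{\mathbf{v}}_0$, the Arrow-Hurwicz method consists of the following coupled iteration: 

\[ \hat{\mathbf{u}}_{k+1} = \hat{\mathbf{u}}_k + \alpha (\mathbf{f} - A\hat{\mathbf{u}}_k - C\hat{\mathbf{v}}_k), \tag{30} \]

\[ \hat{\mathbf{v}}_{k+1} = \hat{\mathbf{v}}_k + \omega (D^T \hat{\mathbf{u}}_{k+1} - \mathbf{g}), \tag{31} \]

where α is also a relaxation parameter. The Arrow-Hurwicz method is useful for the case where the size of A is large, because no linear system has to be solved in this iteration. The iteration can be written in terms of a matrix splitting $\mathcal{A} = \mathcal{P} - \mathcal{Q}$, i.e., as the fixed-point iteration 

\[ \mathcal{P} \hat{\mathbf{x}}_{k+1} = \mathcal{Q} \hat{\mathbf{x}}_k + \mathbf{b}, \tag{32} \]

where 

\[ \mathcal{P} \equiv \begin{bmatrix} \frac{1}{\alpha} I & O \\ D^T & -\frac{1}{\omega} I \end{bmatrix}, \quad \mathcal{Q} \equiv \begin{bmatrix} \frac{1}{\alpha} I - A & -C \\ O & -\frac{1}{\omega} I \end{bmatrix}, \tag{33} \]

and $\hat{\mathbf{x}}_k^T \equiv [\hat{\mathbf{u}}_k^T \ \hat{\mathbf{v}}_k^T]$. In 3D problems, the size of A tends to be large. Thus, we adopt the Arrow-Hurwicz method as an iterative scheme for solving (21). 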
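The iteration (30) and (31) can be sketched as follows, using only matrix-vector products. The toy matrices and the values α = 0.4 and ω = 0.5 are assumptions chosen so that this small example converges; they are not the parameters used in the experiments of Section IV. The stopping test uses the relative change in the maximum norm.

```python
def max_norm(x):
    return max(abs(xi) for xi in x)


def arrow_hurwicz(A, C, D, f, g, alpha, omega, tol=1e-12, max_iter=100000):
    """Arrow-Hurwicz iteration for [[A, C], [D^T, O]] [u; v] = [f; g], zero initial guess."""
    n, m = len(A), len(g)
    u, v = [0.0] * n, [0.0] * m
    for k in range(max_iter):
        # (30): u_{k+1} = u_k + alpha (f - A u_k - C v_k)
        u_new = [u[i] + alpha * (f[i]
                                 - sum(A[i][j] * u[j] for j in range(n))
                                 - sum(C[i][j] * v[j] for j in range(m)))
                 for i in range(n)]
        # (31): v_{k+1} = v_k + omega (D^T u_{k+1} - g)
        v_new = [v[j] + omega * (sum(D[i][j] * u_new[i] for i in range(n)) - g[j])
                 for j in range(m)]
        change = max_norm([a - b for a, b in zip(u_new + v_new, u + v)])
        u, v = u_new, v_new
        # Stop when ||x_{k+1} - x_k|| / ||x_{k+1}|| <= tol (maximum norm)
        if change <= tol * max_norm(u + v):
            return u, v, k + 1
    raise RuntimeError("Arrow-Hurwicz iteration did not converge")


# Hypothetical asymmetric toy problem (C != D), exact solution u = [0.5, 1.0], v = [2.0].
A = [[2.0, 0.0], [0.0, 3.0]]
C = [[1.0], [0.0]]
D = [[0.5], [0.0]]
f = [3.0, 3.0]
g = [0.25]
u, v, iterations = arrow_hurwicz(A, C, D, f, g, alpha=0.4, omega=0.5)
```

Convergence depends on the choice of α and ω; for other matrices the same parameters may converge slowly or not at all.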


C. Preconditioned Krylov Subspace Methods 

For asymmetric linear systems, the incomplete LU factorization (ILU) [10] is well known as a preconditioner for Krylov subspace methods. Although ILU can be applied to (21), we do not employ ILU, because the coefficient matrix of (21) is almost symmetric and we wish to exploit this property. 

To exploit this near symmetry, we adopt the incomplete Cholesky factorization (IC) [11] as a preconditioner for Krylov subspace methods. To this end, we propose a strategy for generating the preconditioning matrices. In this strategy, the factors $LDL^T$ of IC are generated from the symmetric matrix obtained by replacing C with the boundary matrix D of (18): 

\[ \begin{bmatrix} A & D \\ D^T & O \end{bmatrix} \approx L D L^T, \tag{34} \]

where L is a lower triangular matrix, and the D in the product $LDL^T$ is a diagonal matrix (not the boundary matrix D). In (34), we assume 

\[ \mathcal{A} = \begin{bmatrix} A & C \\ D^T & O \end{bmatrix} \approx \begin{bmatrix} A & D \\ D^T & O \end{bmatrix}. \tag{35} \]

As mentioned above, if $\mathbf{x}_k \in S_D$, the kth column of D is exactly the same as that of C. In addition, there is no difference between the matrix A of (21) and that of (34). Hence, we consider the assumption (35) acceptable. Note that we adopt an algorithm of IC in which the matrices $LDL^T$ are as sparse as the matrix A [11]. 

Even if we use IC as a preconditioner, Krylov subspace methods for asymmetric linear systems have to be chosen. Among such methods, Bi-CGSTAB [12] and GMRES(m) [13] are well known, and these iterative methods have produced many attractive results [14]. Here, m is a fixed integer parameter, and GMRES(m) restarts every m steps [13]. For solving (21), we adopt both methods with IC. In the following, Bi-CGSTAB with IC and GMRES(m) with IC are referred to as IC-Bi-CGSTAB and IC-GMRES(m), respectively. 

IV. NUMERICAL EXPERIMENTS 

In this section, the numerical solvers chosen in Section III are applied to the linear system (21). To generate the linear system (21), a 3D Poisson problem is discretized by using the X-EFG. 

Throughout the present section, the region V is assumed to be V = (−0.5, 0.5) × (−0.5, 0.5) × (−0.5, 0.5). In addition, the natural boundary condition is imposed on the surface $S_N$ defined by −0.25 ≤ x ≤ 0.25, −0.25 ≤ y ≤ 0.25 and z = 0.5, and the essential boundary condition is imposed on $S_D \equiv \partial V - S_N$. Moreover, the functions f(x), $\bar{u}$ and $\bar{q}$ are determined so that the analytic solution of the 3D Poisson problem is $u = \exp(-x^2 - y^2 - z^2)$. 

The boundary nodes $\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_M$ are uniformly placed on ∂V, and the nodes $\mathbf{x}_{M+1}, \mathbf{x}_{M+2}, \ldots, \mathbf{x}_N$ are also uniformly placed in V. 

Fig. 1. Dependence of the relative error on the size of coefficient matrix. In this figure, $u^e$ and $u^n$ are exact and numerical solutions, respectively. 

In addition, the exponential weight function [1], 

\[ w(r) \equiv \begin{cases} \dfrac{\exp[-(r/c)^2] - \exp[-(R/c)^2]}{1 - \exp[-(R/c)^2]} & (r \le R), \\[2ex] 0 & (r > R), \end{cases} \tag{36} \]

is adopted for the MLS approximation. Here, R denotes a support radius, and c is a user-specified parameter. We set R = 1.9h and c = h, where h is the minimum distance between two nodes. 
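The weight function (36) with the parameter choice R = 1.9h and c = h can be sketched as below. The function name `exp_weight` and the node spacing h = 0.1 are assumptions made for this illustration.

```python
import math


def exp_weight(r, R, c):
    """Exponential weight function (36): equals 1 at r = 0 and vanishes for r >= R."""
    if r > R:
        return 0.0
    tail = math.exp(-(R / c) ** 2)
    return (math.exp(-(r / c) ** 2) - tail) / (1.0 - tail)


h = 0.1              # hypothetical minimum distance between two nodes
R, c = 1.9 * h, h    # support radius and shape parameter as set in the text
samples = [exp_weight(r, R, c) for r in (0.0, 0.5 * h, h, R, 2.0 * R)]
```

Subtracting the tail value exp[−(R/c)²] makes the weight continuous at r = R, so a node enters or leaves the support without a jump in the weight.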

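In the MLS approximation, the shape functions $\phi_i(\mathbf{x})$ (i = 1, 2, ..., N) can be determined by 

\[ \phi_i(\mathbf{x}) = \mathbf{p}^T(\mathbf{x})\, B^{-1}(\mathbf{x})\, \mathbf{b}_i(\mathbf{x}), \tag{37} \]

where $\mathbf{p}^T(\mathbf{x}) = [1 \ x \ y \ z]$. In addition, the matrix $B(\mathbf{x})$ and the vector $\mathbf{b}_i(\mathbf{x})$ are defined as 

\[ B(\mathbf{x}) = \sum_{k=1}^{N} w_k(\mathbf{x})\, \mathbf{p}(\mathbf{x}_k)\, \mathbf{p}^T(\mathbf{x}_k), \tag{38} \]

\[ \mathbf{b}_i(\mathbf{x}) = w_i(\mathbf{x})\, \mathbf{p}(\mathbf{x}_i), \tag{39} \]

where $w_i(\mathbf{x}) = w(|\mathbf{x} - \mathbf{x}_i|)$. In (9), the partial derivatives of $\phi_i(\mathbf{x})$ with respect to X (= x, y, and z) can be obtained as 

\[ \phi_{i,X}(\mathbf{x}) = \mathbf{p}_X^T(\mathbf{x})\, B^{-1}(\mathbf{x})\, \mathbf{b}_i(\mathbf{x}) + \mathbf{p}^T(\mathbf{x}) \left[ B_X^{-1}(\mathbf{x})\, \mathbf{b}_i(\mathbf{x}) + B^{-1}(\mathbf{x})\, \mathbf{b}_{i,X}(\mathbf{x}) \right], \tag{40} \]

where 

\[ B_X^{-1}(\mathbf{x}) = -B^{-1}(\mathbf{x})\, B_X(\mathbf{x})\, B^{-1}(\mathbf{x}). \tag{41} \]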
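For completeness, the identity (41) follows from differentiating $B B^{-1} = I$ with respect to X. The short derivation below is a standard step added for illustration, using the same notation as (40) and (41):

```latex
\begin{align*}
  B(\mathbf{x})\, B^{-1}(\mathbf{x}) &= I \\
  B_X(\mathbf{x})\, B^{-1}(\mathbf{x}) + B(\mathbf{x})\, B^{-1}_X(\mathbf{x}) &= O \\
  B^{-1}_X(\mathbf{x}) &= -B^{-1}(\mathbf{x})\, B_X(\mathbf{x})\, B^{-1}(\mathbf{x})
\end{align*}
```

Hence the derivative of the inverse needs only $B^{-1}$ and $B_X$, so no additional matrix has to be inverted when evaluating (40).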

For evaluating (9) and (10), a cubic cell structure that is independent of the nodes is used [1], [7], and the Gauss-Legendre quadrature is employed. The number $N_Q$ of quadrature points depends on the number m of nodes in a cell. Throughout this section, $N_Q$ is determined by a criterion similar to that in [1], i.e., $N_Q = n_Q \times n_Q \times n_Q$, where $n_Q = \lfloor \sqrt{m} + 0.5 \rfloor + 2$. In addition, the number $N_C$ of cells is set as $N_C = m_C \times m_C \times m_C$, where $m_C = \lfloor N^{1/3} \rfloor$. 
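The two counting rules above can be sketched as follows. The function names are hypothetical. The cube root is corrected with integer arithmetic because a floating-point cube root of a perfect cube can fall slightly below the exact value, which would make a plain floor return one cell too few.

```python
import math


def quadrature_points_per_axis(m):
    """n_Q = floor(sqrt(m) + 0.5) + 2 for a cell containing m nodes."""
    return math.floor(math.sqrt(m) + 0.5) + 2


def cells_per_axis(n_nodes):
    """m_C = floor(N^(1/3)), corrected with integer arithmetic."""
    k = round(n_nodes ** (1.0 / 3.0))
    while k ** 3 > n_nodes:
        k -= 1
    while (k + 1) ** 3 <= n_nodes:
        k += 1
    return k


n_q = quadrature_points_per_axis(8)   # floor(2.83 + 0.5) + 2
m_c = cells_per_axis(125)             # exact cube
N_Q = n_q ** 3                        # quadrature points per cell
N_C = m_c ** 3                        # number of cells
```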

For the LU factorization, we adopt the sequential SuperLU [15]. In addition, the Column Approximate Minimum Degree Ordering (COLAMD) [16] is employed as an ordering method. This ordering method can easily be used by setting options.ColPerm = COLAMD in SuperLU. For the Arrow-Hurwicz method, we set α = 1.5 and ω = 0.05. In addition, for IC-GMRES(m), we set m = 200. For all iterative solvers, the initial guess of $\hat{\mathbf{x}}$ in (26) is set as $\hat{\mathbf{x}}_0 = \mathbf{0}$. 

Fig. 2. Dependence of the computational time for solving (21) on the size of coefficient matrix. 

Computations were performed on a computer equipped 

with a 2.66GHz Intel Core i7 920 processor, 24GB RAM, 

Ubuntu Linux ver. 12.04, and g++ ver. 4.6.3. Note that we 

only used a single core of this processor in the following 

experiments. Compiler options were set as “-O3 -Wall 

-m64” for all solvers. 

A. Determining εtol for Iterative Solvers 

For the Arrow-Hurwicz method, the iteration is repeated until $\|\hat{\mathbf{x}}_{k+1} - \hat{\mathbf{x}}_k\| / \|\hat{\mathbf{x}}_{k+1}\| \le \varepsilon_{tol}$ is satisfied, where k is the iteration number and $\hat{\mathbf{x}}_k$ is the approximate solution in the kth iteration. For IC-Bi-CGSTAB and IC-GMRES(m), the iteration is repeated until $\|\mathbf{r}_{k+1}\| / \|\mathbf{b}\| \le \varepsilon_{tol}$ is satisfied, where $\mathbf{r}_{k+1}$ is the (k + 1)th residual vector obtained in the algorithm of the Krylov subspace methods. Note that the maximum norm is adopted for the definition of ||·||. 

To determine $\varepsilon_{tol}$, the dependence of the relative error on the size of the coefficient matrix is shown in Fig. 1. Here, the relative error is $\varepsilon \equiv \|u^e - u^n\| / \|u^e\|$, where $u^e$ and $u^n$ are the exact and numerical solutions of u, respectively. In addition, by the first equation of (5), $u^n$ is evaluated with $\hat{\mathbf{u}}$ determined by the LU factorization. From Fig. 1, we see $\varepsilon > 10^{-4}$. Hence, we consider that, in this problem, $\varepsilon_{tol} = 10^{-8}$ is sufficient for obtaining $u^n$ with almost the same accuracy as that shown in Fig. 1. 

B. Performance of Direct and Iterative Solvers 

Let us first investigate the performance of the LU factorization, the Arrow-Hurwicz method, IC-Bi-CGSTAB and IC-GMRES(m). To this end, the dependence of the computational time for solving (21) by using these methods on the size of the coefficient matrix is shown in Fig. 2. We see from this figure that the computational time of the LU factorization is less than that of the Arrow-Hurwicz method. In addition, there is no obvious difference between the computational time of IC-Bi-CGSTAB and that of IC-GMRES(m), and the computational time of both methods is less than that of the LU factorization. The reduction is especially large when the size N + M of the coefficient matrix is relatively large; e.g., for N + M = 42083, IC-Bi-CGSTAB and IC-GMRES(m) are about 9 and 15 times faster than the LU factorization, respectively. From these results, we consider that the strategy described in (34) works well for solving (21). 

Fig. 3. Histories of the relative residual for IC-Bi-CGSTAB and IC-GMRES(m), and those of the relative error for the Arrow-Hurwicz method. (a) and (b) are for N + M = 19083 and 42083, respectively. 

It must be noted here that the Arrow-Hurwicz method does not converge for N + M = 42083. Similarly, IC-Bi-CGSTAB and IC-GMRES(m) do not converge for N + M = 19083. Thus, in Fig. 2, there is no data for these three cases. Hence, the iterative solvers are not always stable in this problem. 

To investigate the behavior of the iterative solvers for N + M = 19083 and 42083, histories of the relative residual $\|\mathbf{r}_{k+1}\| / \|\mathbf{b}\|$ for the Krylov subspace methods, and those of the relative error $\|\hat{\mathbf{x}}_{k+1} - \hat{\mathbf{x}}_k\| / \|\hat{\mathbf{x}}_{k+1}\|$ for the Arrow-Hurwicz method, are shown in Fig. 3. We see from Fig. 3(a) that IC-Bi-CGSTAB rapidly diverges and IC-GMRES(m) oscillates. In addition, the Arrow-Hurwicz method converges, though the convergence speed is slow. Indeed, the iteration number for the Arrow-Hurwicz method is 35324 when $\|\hat{\mathbf{x}}_{k+1} - \hat{\mathbf{x}}_k\| / \|\hat{\mathbf{x}}_{k+1}\| \le \varepsilon_{tol}$ is satisfied. From Fig. 3(b), we see that IC-Bi-CGSTAB and IC-GMRES(m) converge rapidly. In addition, the relative residual of IC-GMRES(m) decreases stably until restarting. For N + M = 42083, the Arrow-Hurwicz method does not converge, even after more than 500000 iterations. 

Fig. 4. Dependence of the condition number of A on the size of coefficient matrix. 

Next, we investigate a property of the coefficient matrix of (21). To this end, the dependence of the condition number of A on the size of the coefficient matrix is shown in Fig. 4. We see from this figure that the condition numbers are not very large, even for N + M = 19083 and 42083. Hence, the condition numbers do not explain why the Krylov subspace methods and the Arrow-Hurwicz method fail to converge for N + M = 19083 and 42083, respectively. 

From these results, it is difficult to identify an appropriate solver in advance. Hence, to stably solve (21) as fast as possible, we suggest that IC-GMRES(m) be used first as a probe for choosing an appropriate solver. After $\hat{n}$ iterations of IC-GMRES(m), if 

\[ \|\mathbf{r}_{k+1}\| / \|\mathbf{b}\| \le \hat{\varepsilon}_{tol} \tag{42} \]

is not satisfied, then it is judged that IC-GMRES(m) will not converge. In this case, the iteration of IC-GMRES(m) is terminated, and (21) is solved by the LU factorization. If (42) is satisfied after $\hat{n}$ iterations, then it is judged that IC-GMRES(m) will converge. Hence, in this case, the iteration of IC-GMRES(m) is continued. We consider that $\hat{\varepsilon}_{tol} = 10^{-2}$ and $\hat{n} = \max(30, (N + M)/500)$ are reasonable choices for this problem. 

Although its convergence is slow, the Arrow-Hurwicz method may work for the case where the Krylov subspace methods do not converge and the LU factorization cannot be executed. This may occur when N + M is too large. 

The above suggestions for choosing solvers are summarized 

as a flow chart shown in Fig. 5. Note that, in 

this flow chart, IC-Bi-CGSTAB can be used instead of 

IC-GMRES(m) by setting ˆεtol and ˆn appropriately. This 

is because, in Fig. 2, when IC-GMRES(m) converges, 

IC-Bi-CGSTAB also converges, and there is no obvious 

difference between the computational time of IC- 

GMRES(m) and that of IC-Bi-CGSTAB. 
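The selection procedure summarized above can be sketched as a small decision function. The names `probe_iterations` and `choose_solver`, the use of integer division for (N + M)/500, and the threshold `lu_size_limit` for "too large" are assumptions made for this illustration; the text does not specify when N + M is too large for the LU factorization.

```python
def probe_iterations(size):
    """n_hat = max(30, (N + M) / 500); integer division is an assumption."""
    return max(30, size // 500)


def choose_solver(rel_residual, size, lu_size_limit, eps_hat=1e-2):
    """Pick a solver from the relative residual after n_hat probe iterations.

    rel_residual  : ||r_{k+1}|| / ||b|| after the IC-GMRES(m) probe
    size          : N + M
    lu_size_limit : hypothetical size above which LU is not feasible
    """
    if rel_residual <= eps_hat:
        return "IC-GMRES(m)"        # (42) satisfied: keep iterating
    if size > lu_size_limit:
        return "Arrow-Hurwicz"      # LU factorization cannot be executed
    return "LU factorization"


n_hat_small = probe_iterations(19083)
n_hat_large = probe_iterations(42083)
decision = choose_solver(rel_residual=0.5, size=19083, lu_size_limit=10**6)
```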

V. CONCLUSION 

To impose not only the essential boundary condition but also the natural one without any integrations, the EFG has been reformulated, and this method is called an X-EFG. A linear system obtained by the X-EFG becomes an asymmetric saddle point problem. To investigate appropriate solvers for this problem, the linear system obtained from a 3D Poisson problem discretized by the X-EFG has been solved by the LU factorization, the Arrow-Hurwicz method, IC-Bi-CGSTAB, and IC-GMRES(m) in the numerical experiments. Conclusions obtained in the present study are summarized as follows: 

1) By using the X-EFG, the essential and natural 

boundary conditions can be imposed without any 

integrations. 

2) Although the linear system obtained by the X- 

EFG has the asymmetric coefficient matrix, the 

incomplete Cholesky factorization works well as a 

preconditioner for Bi-CGSTAB and GMRES(m). 

3) By using IC-Bi-CGSTAB and IC-GMRES(m), the linear system can be solved faster than by the LU factorization in relatively large problems. However, these iterative methods sometimes do not converge, regardless of the condition number. 

4) To stably solve the linear system as fast as possible, 

an appropriate solver can be chosen by the flow 

chart shown in Fig. 5. 

As future work, the X-EFG will be applied to more 

practical problems in various fields, including electromagnetics. 

ACKNOWLEDGMENTS 

This work was partially supported by JSPS KAKENHI 

Grant Numbers 24700053 and 22360042. 

REFERENCES 

[1] T. Belytschko, Y. Y. Lu, and L. Gu, “Element-free Galerkin 

methods,” Int. J. Numer. Methods Eng., vol. 37, pp. 229–256, 

1994. 

[2] S. N. Atluri and T. Zhu, “A new meshless local Petrov-Galerkin 

(MLPG) approach in computational mechanics,” Comput. Mech., 

vol. 22, pp. 117–127, 1998. 

[3] A. Manzin, D. P. Ansalone, and O. Bottauscio, “Numerical modeling of biomolecular electrostatic properties by the element-free Galerkin method,” IEEE Trans. on Magn., vol. 47, no. 5, pp. 1382–1385, 2011. 

[4] S. Ikuno, T. Hanawa, T. Takayama, and A. Kamitani, “Evaluation 

of parallelized meshless approach: Application to shielding 

current analysis in HTS,” IEEE Trans. on Magn., vol. 44, pp. 

1230–1233, 2008. 


[Flow chart: Start → the iteration of IC-GMRES(m) is repeated until the iteration number k = n̂ → is ||rk+1||/||b|| ≤ ε̂tol satisfied? Yes: IC-GMRES(m). No: is N + M too large? Yes: Arrow-Hurwicz. No: LU factorization.] 

Fig. 5. A flow chart for choosing an appropriate solver. Note that IC-Bi-CGSTAB can be used instead of IC-GMRES(m) in this flow chart. 

[5] G. F. Parreira, E. J. Silva, A. Fonseca, and R. Mesquita, “The 

element-free Galerkin method in three-dimensional electromagnetic 

problems,” IEEE Trans. on Magn., vol. 42, no. 4, pp. 711– 

714, 2006. 

[6] G. Ni, S. L. Ho, S. Yang, and P. Ni, “Meshless local Petrov- 

Galerkin method and its application to electromagnetic field 

computations,” International Journal of Applied Electromagnetics 

and Mechanics, vol. 19, pp. 111–117, 2004. 

[7] G. R. Liu, Meshfree Methods: Moving beyond the Finite Element 

Method (2nd Edition). Boca Raton: CRC Press LLC, 2009. 

[8] A. Kamitani, T. Takayama, T. Itoh, and H. Nakamura, “Extension 

of meshless Galerkin/Petrov-Galerkin approach without using 

Lagrange multipliers,” Plasma and Fusion Research, vol. 6, no. 

2401074, 2011. 

[9] J. Zhao, “The generalized Cholesky factorization method for 

saddle point problems,” Applied Mathematics and Computation, 

vol. 92, pp. 49–58, 1998. 

[10] Y. Saad, Iterative Methods for Sparse Linear Systems, 2nd Edition. 

Philadelphia: SIAM, 2003. 

[11] G. H. Golub and C. F. Van Loan, Matrix Computations, 3rd 

Edition. Baltimore and London: Johns Hopkins University Press, 

1996. 

[12] H. A. van der Vorst, “Bi-CGSTAB: A fast and smoothly converging 

variant of Bi-CG for the solution of nonsymmetric linear 

systems,” SIAM J. Sci. Stat. Comput., vol. 13, no. 2, pp. 631–644, 

1992. 

[13] Y. Saad and M. H. Schultz, “GMRES: A generalized minimal 

residual algorithm for solving nonsymmetric linear systems,” 

SIAM J. Sci. Stat. Comput., vol. 7, no. 3, pp. 856–869, 1986. 

[14] H. A. van der Vorst, Iterative Krylov Methods for Large Linear 

Systems (Cambridge Monographs on Applied & Computational 

Mathematics). Cambridge: Cambridge University Press, 2003. 

[15] J. W. Demmel, S. C. Eisenstat, J. R. Gilbert, X. S. Li, and J. W. H. 

Liu, “A supernodal approach to sparse partial pivoting,” SIAM J. 

Matrix Analysis and Applications, vol. 20, pp. 720–755, 1999. 

[16] T. A. Davis, J. R. Gilbert, S. Larimore, and E. Ng, “A column 

approximate minimum degree ordering algorithm,” ACM Trans. 

Mathematical Software, vol. 30, pp. 353–376, 2004.


Electromagnetic Wave Propagation Simulation 

in Corrugated Waveguide using Meshless Time 

Domain Method 

Soichiro Ikuno∗ , Yoshihisa Fujita∗ , Taku Itoh∗ , Susumu Nakata † and Atsushi Kamitani ‡ 

∗Tokyo University of Technology, 1404-1 Katakura, Hachioji, Tokyo 192-0982, Japan 

† Ritsumeikan University, 1-1-1 Nojihigashi, Kusatsu, Shiga 525-8577, Japan 

‡ Yamagata University, 4-3-16 Johnan, Yonezawa, Yamagata, 992-8510, Japan 

E-mail: s.ikuno@ieee.org 

Abstract—The simulation of electromagnetic wave propagation in a complex shaped corrugated waveguide using the Meshless Time Domain Method (MTDM) based on the Radial Point Interpolation Method is numerically investigated. Like other meshless methods, MTDM does not require finite elements or meshes of a geometrical structure. In MTDM, the only necessary information is the location of the nodes and the arrangement of the node structure of the electric and magnetic fields. By using the simulation code for analyzing electromagnetic wave propagation in a complex shaped waveguide, the influence of the node alignment on wave propagation is numerically evaluated. Moreover, the influence of the frequency and the corrugation pitch on the damping rate is evaluated. The results of computation show that the node alignment based on the staggered grid that is generally used in standard FDTD is suitable for the numerical calculation. In addition, the relationship between the corrugation pitch and the frequency is numerically evaluated. 

Index Terms—FDTD, RPIM, wave propagation, corrugated waveguide 


I. INTRODUCTION 

In the Large Helical Device (LHD), an electron cyclotron heating device is used for plasma heating. The electrical power generated by the gyrotron system is transmitted to the LHD through a long corrugated waveguide. However, neither the appropriate shape of the curvature of the waveguide nor the transmission gain of the electromagnetic wave propagation has been clarified theoretically. 

The Finite Difference Time Domain (FDTD) method solves Maxwell's equations directly, and the method is frequently applied to electromagnetic wave propagation simulations [1], [2]. Furthermore, the FDTD method has great advantages in terms of parallelization, the treatment of problems and so on. However, the numerical domain must be divided into rectangular meshes if the FDTD method is applied to the simulation, and it is difficult to treat problems in a complex domain. 

As is well known, the meshless approach does not require finite elements or meshes of a geometrical structure. Various meshless approaches such as the diffuse element method [3], the element-free Galerkin (EFG) method [4], the meshless local Petrov-Galerkin (MLPG) method [5] and the radial point interpolation method (RPIM) [6] have been developed. These methods are applied to a variety of engineering fields, including computational magnetics [7], [8], [9]. In particular, meshless approaches based on RPIM are applied to time dependent problems [10]. Like other meshless methods, the Meshless Time Domain Method (MTDM) [11] does not require finite elements or meshes of a geometrical structure. In MTDM, the only necessary information is the location of the nodes and the arrangement of the node structure of the electric and magnetic fields. Thus, MTDM can be easily applied to time dependent simulations of problems in a complex shaped domain. 

The purpose of the present study is to develop the 

numerical code for analyzing electromagnetic wave propagation 

in corrugated waveguide, and to investigate the 

optimal shape of corrugated waveguide. 

II. SHAPE FUNCTION BASED ON RPIM 

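First, we scatter N nodes $\mathbf{x}_1, \mathbf{x}_2, \cdots, \mathbf{x}_N$ in the target domain and on the boundary, and assign the Radial Basis Functions (RBF) $w_1(\mathbf{x}), w_2(\mathbf{x}), \cdots, w_N(\mathbf{x})$ with compact support to the nodes. Then, the solution $u(\mathbf{x})$ can be expanded as 

\[ u(\mathbf{x}) = [\mathbf{w}(\mathbf{x})^T, \mathbf{p}(\mathbf{x})^T]\, G^{-1} \begin{bmatrix} \mathbf{u} \\ \mathbf{0} \end{bmatrix} = \boldsymbol{\phi}(\mathbf{x})\, \mathbf{u}, \tag{1} \]

where the vectors $\mathbf{w}(\mathbf{x})$, $\mathbf{p}(\mathbf{x})$, $\mathbf{u}$ and $\boldsymbol{\phi}(\mathbf{x})$ are defined by 

\[ \mathbf{w}(\mathbf{x}) = [w_1(\mathbf{x}), w_2(\mathbf{x}), \cdots, w_N(\mathbf{x})]^T, \tag{2} \]
\[ \mathbf{p}(\mathbf{x}) = [p_1(\mathbf{x}), p_2(\mathbf{x}), \cdots, p_M(\mathbf{x})]^T, \tag{3} \]
\[ \mathbf{u} = [u_1, u_2, \cdots, u_N]^T, \tag{4} \]
\[ \boldsymbol{\phi}(\mathbf{x}) = [\phi_1(\mathbf{x}), \phi_2(\mathbf{x}), \cdots, \phi_N(\mathbf{x})]^T, \tag{5} \]

where $\phi_i(\mathbf{x})$ denotes the shape function of the i-th node. The components of the vector $\mathbf{p}(\mathbf{x})$ are monomials of the space variables. For example, $\mathbf{p}(\mathbf{x})^T = [1, x, y]$ and $\mathbf{p}(\mathbf{x})^T = [1, x, y, x^2, xy, y^2]$ are the monomials for the linear and the quadratic approximation, respectively. Furthermore, the matrix G is defined by the following equation. 

\[ G = \begin{bmatrix} W & P \\ P^T & O \end{bmatrix}. \tag{6} \]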

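Here, the matrices W and P are defined by the following equations. 

\[ W = [\mathbf{w}(\mathbf{x}_1), \mathbf{w}(\mathbf{x}_2), \cdots, \mathbf{w}(\mathbf{x}_N)]^T, \tag{7} \]
\[ P = [\mathbf{p}(\mathbf{x}_1), \mathbf{p}(\mathbf{x}_2), \cdots, \mathbf{p}(\mathbf{x}_N)]^T. \tag{8} \]

In the present study, the following three functions are adopted for the weight function. 

\[ w_i(\mathbf{x}_j) = \begin{cases} \dfrac{e^{-(r/c)^2} - e^{-(R/c)^2}}{1.0 - e^{-(R/c)^2}}, & r < R, \\[2ex] 0, & r \ge R, \end{cases} \tag{9} \]

\[ w_i(\mathbf{x}_j) = 1.0 - 6.0 \left( \frac{r}{R} \right)^2 + 8.0 \left( \frac{r}{R} \right)^3 - 3.0 \left( \frac{r}{R} \right)^4, \tag{10} \]

\[ w_i(\mathbf{x}_j) = \left[ \left( \frac{r}{R} \right)^2 + 1.0 \right]^{-0.5}. \tag{11} \]

Here, R denotes the support radius of the influence domain and c denotes a constant. Moreover, r is defined by $r = |\mathbf{x} - \mathbf{x}_i|$. Under the above assumptions, the shape function and its derivative can be expressed as 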

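\[ \phi_k(\mathbf{x}) = \sum_{i=1}^{N} w_i(\mathbf{x})\, g_{i,k} + \sum_{j=1}^{M} p_j(\mathbf{x})\, g_{N+j,k}, \tag{12} \]

\[ \frac{\partial \phi_k}{\partial x} = \sum_{i=1}^{N} \frac{\partial w_i(\mathbf{x})}{\partial x}\, g_{i,k} + \sum_{j=1}^{M} \frac{\partial p_j(\mathbf{x})}{\partial x}\, g_{N+j,k}, \tag{13} \]

\[ \frac{\partial \phi_k}{\partial y} = \sum_{i=1}^{N} \frac{\partial w_i(\mathbf{x})}{\partial y}\, g_{i,k} + \sum_{j=1}^{M} \frac{\partial p_j(\mathbf{x})}{\partial y}\, g_{N+j,k}, \tag{14} \]

where $g_{i,j}$ denotes the (i, j) element of the matrix $G^{-1}$. 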
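The three weight functions (9), (10) and (11) can be sketched as plain functions of the distance r. The function names are hypothetical, and the form of (11) follows the reading $[(r/R)^2 + 1.0]^{-0.5}$ of the printed equation, which is an assumption. Equations (10) and (11) are printed without an explicit cut-off for r ≥ R, so none is added here.

```python
import math


def weight_exponential(r, R, c):
    """Weight function (9): truncated and shifted Gaussian, zero for r >= R."""
    if r >= R:
        return 0.0
    tail = math.exp(-(R / c) ** 2)
    return (math.exp(-(r / c) ** 2) - tail) / (1.0 - tail)


def weight_quartic(r, R):
    """Weight function (10): quartic polynomial in r/R, vanishing at r = R."""
    s = r / R
    return 1.0 - 6.0 * s ** 2 + 8.0 * s ** 3 - 3.0 * s ** 4


def weight_inverse_multiquadric(r, R):
    """Weight function (11), read as [(r/R)^2 + 1]^(-1/2) (assumed form)."""
    return ((r / R) ** 2 + 1.0) ** -0.5


R, c = 2.0, 1.0   # hypothetical support radius and shape constant
values_at_centre = (weight_exponential(0.0, R, c),
                    weight_quartic(0.0, R),
                    weight_inverse_multiquadric(0.0, R))
```

All three functions equal 1 at the node itself (r = 0); they differ in how quickly they decay towards the edge of the influence domain.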

Note that the shape function satisfies the Kronecker delta function property, i.e., 

\[ \phi_i(\mathbf{x}_j) = \begin{cases} 1, & i = j, \\ 0, & i \ne j. \end{cases} \tag{15} \]

From this property, the function can be expanded by using the shape function based on RPIM as follows. 

\[ u(\mathbf{x}_i) = \sum_{j=1}^{N} \phi_i(\mathbf{x}_j)\, u_i = u_i. \tag{16} \]

In the next section, the Meshless Time Domain Method is formulated by using the above shape function. 

III. MESHLESS TIME DOMAIN METHOD 

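In the present study, 2D electromagnetic wave propagation of the TM mode is adopted for the evaluation. The governing equations of the problem are 

\[ \varepsilon \frac{\partial E_z}{\partial t} = -\sigma E_z + \frac{\partial H_y}{\partial x} - \frac{\partial H_x}{\partial y}, \tag{17} \]

\[ \mu \frac{\partial H_x}{\partial t} = -\frac{\partial E_z}{\partial y}, \tag{18} \]

\[ \mu \frac{\partial H_y}{\partial t} = \frac{\partial E_z}{\partial x}, \tag{19} \]

where $H_x$ and $H_y$ denote the x and y components of the magnetic field, and $E_z$ denotes the z component of the electric field. In addition, ε, μ and σ denote the permittivity, permeability and electrical conductivity, respectively. 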

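The system is discretized with respect to time by applying the Leap-Frog method, and it is transformed to the following equations. 

\[ \varepsilon \frac{E_z^{n+1} - E_z^{n}}{\Delta t} + \sigma E_z^{n+\frac{1}{2}} = \frac{\partial H_y^{n+\frac{1}{2}}}{\partial x} - \frac{\partial H_x^{n+\frac{1}{2}}}{\partial y}, \tag{20} \]

\[ \mu \frac{H_x^{n+1/2} - H_x^{n-1/2}}{\Delta t} = -\frac{\partial E_z^{n}}{\partial y}, \tag{21} \]

\[ \mu \frac{H_y^{n+1/2} - H_y^{n-1/2}}{\Delta t} = \frac{\partial E_z^{n}}{\partial x}. \tag{22} \]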

As we mentioned above, the shape function of RPIM 

has the Kronecker delta function property (15). By using 

the shape function and the property, the system can be 

discretized with respect to space as follows. 

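\[
E_{z,i}^{n+1} = \alpha \left[ \left( \frac{\varepsilon}{\Delta t} - \frac{\sigma}{2} \right) E_{z,i}^{n} + \sum_{j=1}^{N} H_{y,j}^{n+\frac{1}{2}} \frac{\partial \phi_j^{H}}{\partial x} - \sum_{j=1}^{N} H_{x,j}^{n+\frac{1}{2}} \frac{\partial \phi_j^{H}}{\partial y} \right], \tag{23}
\]

\[
H_{x,i}^{n+\frac{1}{2}} = H_{x,i}^{n-\frac{1}{2}} - \frac{\Delta t}{\mu} \sum_{j=1}^{N} E_{z,j}^{n} \frac{\partial \phi_j^{E}}{\partial y}, \tag{24}
\]

\[
H_{y,i}^{n+\frac{1}{2}} = H_{y,i}^{n-\frac{1}{2}} + \frac{\Delta t}{\mu} \sum_{j=1}^{N} E_{z,j}^{n} \frac{\partial \phi_j^{E}}{\partial x}. \tag{25}
\]

Here, $\phi_i^{E}$ and $\phi_i^{H}$ denote the shape functions for the electric field and the magnetic field, and the parameter α is defined by the following equation.

\[
\alpha = \frac{1}{\dfrac{\varepsilon}{\Delta t} + \dfrac{\sigma}{2}}. \tag{26}
\]

Note that the average of $E_{z,i}^{n}$ and $E_{z,i}^{n+1}$ is adopted for $E_{z,i}^{n+1/2}$. By solving (23), (24) and (25) alternately in each time step, we obtain the time-dependent behavior of the electromagnetic wave propagation in waveguides of various shapes.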
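The per-node updates (23)-(26) can be written as a minimal Python sketch. It is illustrative only: the function names are hypothetical, and the arguments `curl_h` and `grad_e` stand in for the shape-function sums in (23)-(25), which are assumed to have been evaluated beforehand.

```python
def alpha(eps, sigma, dt):
    """Eq. (26): alpha = 1 / (eps/dt + sigma/2)."""
    return 1.0 / (eps / dt + sigma / 2.0)

def update_ez(ez_old, curl_h, eps, sigma, dt):
    """Eq. (23) at one node.

    curl_h is the precomputed value of
    sum_j Hy_j dphi_j/dx - sum_j Hx_j dphi_j/dy at time level n + 1/2.
    """
    return alpha(eps, sigma, dt) * ((eps / dt - sigma / 2.0) * ez_old + curl_h)

def update_h(h_old, grad_e, mu, dt, sign):
    """Eqs. (24) and (25) at one node.

    grad_e is the precomputed sum_j Ez_j dphi_j/dy (for Hx, sign = -1)
    or sum_j Ez_j dphi_j/dx (for Hy, sign = +1) at time level n.
    """
    return h_old + sign * dt / mu * grad_e

# Usage with arbitrary illustrative numbers (not taken from the paper):
ez_new = update_ez(ez_old=3.0, curl_h=0.0, eps=1.0, sigma=1.0, dt=1.0)
hx_new = update_h(h_old=1.0, grad_e=2.0, mu=4.0, dt=2.0, sign=-1)
print(ez_new, hx_new)
```

With no field gradients, the electric field is simply scaled by (ε/Δt − σ/2)/(ε/Δt + σ/2) per step, which shows how the averaged conductivity term in (20) produces damping.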

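In the present study, the Perfectly Matched Layer (PML) and the Perfect Magnetic Conductor (PMC) are used as the absorbing boundary condition and the boundary condition, respectively. The z component of the electric field is divided into

\[
E_z = E_{zx} + E_{zy}, \tag{27}
\]

where the components are governed by the following equations.

\[
j\omega\varepsilon E_{zx} + \sigma_x E_{zx} = \frac{\partial H_y}{\partial x}, \tag{28}
\]

\[
j\omega\varepsilon E_{zy} + \sigma_y E_{zy} = -\frac{\partial H_x}{\partial y}. \tag{29}
\]

Here, j denotes the imaginary unit, and ω denotes the angular frequency. By using (27), the basic governing equations of the PML are written as follows.

\[
\varepsilon \frac{\partial E_{zx}}{\partial t} = -\sigma_x E_{zx} + \frac{\partial H_y}{\partial x}, \tag{30}
\]

\[
\varepsilon \frac{\partial E_{zy}}{\partial t} = -\sigma_y E_{zy} - \frac{\partial H_x}{\partial y}, \tag{31}
\]

\[
\mu \frac{\partial H_x}{\partial t} = -\sigma_y^{*} H_x - \frac{\partial E_z}{\partial y}, \tag{32}
\]

\[
\mu \frac{\partial H_y}{\partial t} = -\sigma_x^{*} H_y + \frac{\partial E_z}{\partial x}. \tag{33}
\]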

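Here, μ denotes the permeability, and $\sigma_x^{*}$, $\sigma_y^{*}$ denote the magnetic conductivities of the PML. Taking into account the delta function property of the shape function based on RPIM, and discretizing with respect to time using the Leap-Frog method, we obtain the following discretized equations for the PML

\[
E_{zx,m}^{n} = \frac{\left( \dfrac{\varepsilon}{\Delta t} - \dfrac{\sigma_x}{2} \right) E_{zx,m}^{n-1} + \displaystyle\sum_{i=1}^{N} H_{y,i}^{n-\frac{1}{2}} \frac{\partial \phi_i^{H}}{\partial x}}{\dfrac{\varepsilon}{\Delta t} + \dfrac{\sigma_x}{2}}, \tag{34}
\]

\[
E_{zy,m}^{n} = \frac{\left( \dfrac{\varepsilon}{\Delta t} - \dfrac{\sigma_y}{2} \right) E_{zy,m}^{n-1} - \displaystyle\sum_{i=1}^{N} H_{x,i}^{n-\frac{1}{2}} \frac{\partial \phi_i^{H}}{\partial y}}{\dfrac{\varepsilon}{\Delta t} + \dfrac{\sigma_y}{2}}, \tag{35}
\]

\[
H_{x,m}^{n+\frac{1}{2}} = \frac{\left( \dfrac{\mu}{\Delta t} - \dfrac{\sigma_y^{*}}{2} \right) H_{x,m}^{n-\frac{1}{2}} - \displaystyle\sum_{i=1}^{N} E_{z,i}^{n} \frac{\partial \phi_i^{E}}{\partial y}}{\dfrac{\mu}{\Delta t} + \dfrac{\sigma_y^{*}}{2}}, \tag{36}
\]

\[
H_{y,m}^{n+\frac{1}{2}} = \frac{\left( \dfrac{\mu}{\Delta t} - \dfrac{\sigma_x^{*}}{2} \right) H_{y,m}^{n-\frac{1}{2}} + \displaystyle\sum_{i=1}^{N} E_{z,i}^{n} \frac{\partial \phi_i^{E}}{\partial x}}{\dfrac{\mu}{\Delta t} + \dfrac{\sigma_x^{*}}{2}}, \tag{37}
\]

where Δt denotes the time step size and the superscript n denotes the step number.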

In MTDM, the nodes for the electric field and the magnetic field should be separated, and the following four types of node alignment are adopted for accuracy evaluation, as shown in Fig. 1.

The first alignment type is based on the normal meshless method, in which the nodes for the electric field and the magnetic field are located at the same positions, as shown in Fig. 1 (a). In the second type, the node for the magnetic field is located at the center of the diagonal of the nodes for the electric field, as shown in Fig. 1 (b). The third type is based on the staggered grid that is generally used in standard FDTD, and the fourth type is a mixed version of the second and third types, as shown in Fig. 1 (c) and (d), respectively.

Fig. 1. The schematic view of the four types of node alignment, (a) to (d), of the electric field and the magnetic field.

IV. INFLUENCE OF NODE ALIGNMENT

As is well known, FDTD is an explicit method. Thus, the method must satisfy the Courant condition, i.e.,

\[
\Delta t < \frac{1}{v} \frac{1}{\sqrt{\left( \dfrac{1}{\Delta x} \right)^2 + \left( \dfrac{1}{\Delta y} \right)^2}}, \tag{38}
\]

where Δx and Δy denote the division sizes in the x and y directions, and v denotes the wave speed. On the other hand, MTDM has no concept of a mesh, as mentioned above. Therefore, the following criterion is derived for stable calculation [11].

\[
\Delta t < \frac{\min_i |x_i - x|}{v}. \tag{39}
\]

Here, $\min_i |x_i - x|$ denotes the distance to the neighboring node, and the time step size Δt is determined so as to satisfy the criterion (39).
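The two time step bounds (38) and (39) can be compared with a short Python sketch. The function names are hypothetical, and (39) is interpreted here as using the smallest distance between any two nodes of the set, which is an assumption. The example spacing of λ/20 is also an assumption, read from the "20/λ" entry of Table I.

```python
import math

def dt_fdtd(dx, dy, v):
    """Upper bound of the Courant condition, Eq. (38)."""
    return 1.0 / (v * math.sqrt((1.0 / dx) ** 2 + (1.0 / dy) ** 2))

def dt_mtdm(nodes, v):
    """Upper bound of Eq. (39): smallest node-to-node distance divided by v."""
    d_min = min(math.dist(a, b)
                for i, a in enumerate(nodes)
                for b in nodes[i + 1:])
    return d_min / v

# Usage: 1 GHz wave, v = 3.0e8 m/s, node spacing assumed to be lambda/20.
v = 3.0e8
wavelength = v / 1.0e9
h = wavelength / 20.0
nodes = [(i * h, j * h) for i in range(4) for j in range(4)]
print(dt_fdtd(h, h, v), dt_mtdm(nodes, v))
```

For a uniform square arrangement with spacing h, the bound (38) equals h/(v√2), while (39) gives h/v, so the FDTD bound is the more restrictive of the two in this case.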

To evaluate the influence of node alignment, the damping rate RD is introduced.

\[
R_D = \frac{\displaystyle\int_{\Gamma_{\mathrm{out}}} P_z \, dl}{\displaystyle\int_{\Gamma_{\mathrm{in}}} P_z \, dl}. \tag{40}
\]

Here, $P_z$ denotes the z component of the Poynting vector P = B × E, and $\Gamma_{\mathrm{in}}$ and $\Gamma_{\mathrm{out}}$ denote the source input line and the observation line, respectively. We can see from the above equation that if RD = 1, the waveguide can be regarded as an ideal lossless waveguide. In addition, the physical parameters for the calculation are shown in Table I.
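As a minimal sketch of how (40) might be evaluated numerically, the two line integrals can be approximated by the trapezoidal rule over equally spaced samples of Pz. The function names, the equal spacing `dl` and the sample values below are assumptions made for illustration only.

```python
def line_integral(samples, dl):
    """Trapezoidal rule for equally spaced samples along a line."""
    return dl * (sum(samples) - 0.5 * (samples[0] + samples[-1]))

def damping_rate(pz_out, pz_in, dl):
    """Eq. (40): ratio of the integrals of Pz over the observation and input lines."""
    return line_integral(pz_out, dl) / line_integral(pz_in, dl)

# Usage with made-up sample values of Pz along both lines:
pz_in = [1.0, 2.0, 1.0]
pz_out = [0.5, 1.0, 0.5]
print(damping_rate(pz_out, pz_in, dl=0.1))
```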

TABLE I
PHYSICAL PARAMETERS FOR THE CALCULATION. HERE λ DENOTES A WAVELENGTH.

Input wave                      Sine wave
Amplitude                       1.0 [V/m]
Frequency                       1.0, 15.0, 30.0 [GHz]
Wave speed                      3.0 × 10^8 [m/s]
Distance of neighboring node    20/λ
Number of layers for PML        16
Dimension of PML                4
Reflectance factor of PML       −80 [dB]

Fig. 2. The values of damping rate RD plotted as a function of support radius R (0.04 to 0.06) in the case of the first type of node alignment shown in Fig. 1 (a). Lines (a), (b) and (c) are evaluated by using Eqs. (9), (10) and (11), respectively.

In Figs. 2, 3, 4 and 5, we show the influence of the support radius R on the damping rate RD for the various weight functions. Note that a normal line waveguide is adopted for the evaluation, and the same value of the support radius R is adopted for the electric field shape function and the magnetic field shape function. In addition, the frequency of the input wave is fixed at 1 [GHz]. We can see from these figures that the values of the damping rate RD are not strictly stable when the spline weight function is adopted for the shape function construction. On the other hand, if the Gauss type weight function is adopted, the value of the damping rate RD generally stays flat around unity for all the types of node alignment. From this result, the Gauss type weight function (9) is suitable as the MTDM weight function, and it is adopted for the rest of the calculations. Furthermore, we can see from these figures that the node alignment of the third type (see Fig. 1 (c)) leads to stable calculation. Thus, in the following calculations the node alignment of the third type is adopted.

V. WAVE PROPAGATION SIMULATION IN 

CORRUGATED WAVEGUIDE 

Let us first show the distribution of the electric field in a curved corrugated waveguide. The schematic view of the curved corrugated waveguide used in the calculation is shown in Fig. 6 (a). The corrugations are made with a regular pitch on the straight part, and unequally spaced gaps are made on the curved part, as shown in Fig. 6 (a). The distribution of the z component of the electric field Ez in the case of W = 50 mm is also shown in Fig. 6 (b). In this figure, the reflected wave is observed at the curved part of the waveguide. Note that the reflected wave increases as the width of the waveguide W increases. In other words, the damping rate increases as the value of W increases, and this phenomenon is also related to the wavelength and the curvature of the waveguide.

Fig. 3. The values of damping rate RD plotted as a function of support radius R (0.04 to 0.06) in the case of the second type of node alignment shown in Fig. 1 (b). Lines (a), (b) and (c) are evaluated by using Eqs. (9), (10) and (11), respectively.

Fig. 4. The values of damping rate RD plotted as a function of support radius R (0.04 to 0.06) in the case of the third type of node alignment shown in Fig. 1 (c). Lines (a), (b) and (c) are evaluated by using Eqs. (9), (10) and (11), respectively.

Fig. 5. The values of damping rate RD plotted as a function of support radius R (0.04 to 0.06) in the case of the fourth type of node alignment shown in Fig. 1 (d). Lines (a), (b) and (c) are evaluated by using Eqs. (9), (10) and (11), respectively.

Fig. 6. (a): The schematic view of the analytic region and (b): the distribution of electric field Ez in the corrugated waveguide in the case of W = 50 mm. Axes: x (m) and y (m).

Fig. 7. Analytic models for evaluating damping rate RD. (a): Line waveguide, (b): Curved waveguide.

Next, we evaluate the influence of the frequency and the pitch of the corrugation on the damping rate RD. The analytic models for evaluating the damping rate RD are a line corrugated waveguide (see Fig. 7 (a)) and a curved corrugated waveguide (see Fig. 7 (b)). The pitch of the corrugation is Cλ, where C denotes a constant. As in the previous evaluation, the corrugations of the curved corrugated waveguide are made with a regular pitch on the straight part and with unequally spaced gaps on the curved part.

Fig. 8. The influence of frequencies (1 GHz, 15 GHz and 30 GHz) and pitch of corrugate (0λ to 1.2λ) on damping rate RD in the line waveguide.

Fig. 9. The influence of frequencies (1 GHz, 15 GHz and 30 GHz) and pitch of corrugate (0λ to 1.2λ) on damping rate RD in the curved waveguide.

By using the analytic models, the influence of frequencies and pitch of corrugate D on damping rate RD in the corrugated waveguide is evaluated, and the results are shown in Fig. 8 and 9. In the line waveguide, the magnetic wave propagates stationary in case of 0.0λ

• The values of the damping rate RD are not strictly stable when the spline weight function is adopted for the shape function construction.

• On the other hand, if the Gauss type weight function is adopted, the value of the damping rate RD generally stays flat around unity for all the types of node alignment.

• The node alignment based on the staggered grid that is generally used in standard FDTD should be used for MTDM simulation.

• The reflected wave increases as the width of the waveguide W increases. In other words, the damping rate increases as the value of W increases, and this phenomenon is also related to the wavelength and the curvature of the waveguide.

• In the line corrugated waveguide, the magnetic wave propagated stationary in case of 0.0λ


Optimization of Permanent Magnet Linear Actuator 

for Braille Screen 

*Ivan S. Yatchev, *Iosko S. Balabozov, *Krastio L. Hinov, *Vultchan T. Gueorgiev and 

**Dimitar N. Karastoyanov 

* Faculty of Electrical Engineering, Technical University of Sofia, 8, Kliment Ohridsky Blvd., 1000 Sofia, Bulgaria 

** Institute of Information and Communication Technologies, Bulgarian Academy of Sciences, Acad. G. Bonchev St., 

Block 2, 1113 Sofia, Bulgaria 

E-mail: yatchev@tu-sofia.bg 

Abstract—Permanent magnet linear actuator intended for driving a needle in Braille screen has been optimized. The mover 

of the actuator is a combined one - it consists of permanent magnet and ferromagnetic discs. The optimization is carried out 

with respect to minimal magnetomotive force ensuring required minimum electromagnetic force on the mover. The 

optimization factors are dimensions of the cores and mover parts under additional constraint for overall dimension of the 

actuator. Finite element analysis, response surface methodology and design of experiments have been employed for the 

optimization. The obtained optimal solution is verified again by finite element analysis. 

Index Terms—actuators, Braille screen, optimization, secondary models. 


I. INTRODUCTION

The application of permanent magnets in actuator constructions has increased considerably in recent years. One of the reasons is that they make the development of energy-efficient actuators possible. New constructions of permanent magnet actuators are employed for different purposes. One such purpose is to help visually impaired people perceive images using so-called Braille screens. Recently, different approaches have been utilized for the actuators used to move Braille dots [1]-[6].

Typical view of a Braille screen is shown in Fig. 1. 

Figure 1: Braille screen with needles (dots) driven by 

linear actuators. 

In the present paper, a recently developed permanent magnet linear actuator for driving a needle (dot) in a Braille screen is optimized using response surface methodology (RSM) and design of experiments (DoE).

The main application imposes strict requirements on the driver of the Braille screen needles. These requirements can be summarized as follows:

- strict dimension constraints, especially in the radial direction: outer diameter of the driver 3-6 mm;

- holding force 0.2-0.5 N;

- minimum energy consumption. 

The minimum energy consumption can be achieved by 

polarized construction of the driving electromagnet 

actuator because no power will be consumed at steady 

state. 

II. ACTUATOR CONSTRUCTION 

The principal actuator construction is shown in Fig. 2. The moving part is an axially magnetized cylindrical permanent magnet with a ferromagnetic disc on each side.

The two coils are connected in series in such a way that they create magnetic fluxes of opposite directions in the region of the permanent magnet. In this way, depending on the polarity of the power supply, the permanent magnet will move either up or down. When upward motion is needed, the upper coil should create a flux in the air gap coinciding with the flux of the permanent magnet. The lower coil at the same time creates an opposite flux, and the permanent magnet moves upward. When downward motion is needed, the polarity of the power supply is reversed. The motion is transferred to the Braille dot by the non-magnetic shaft.

Figure 2: Principal construction of the studied actuator. 

1–upper shaft; 2–upper core; 3–outer core; 4–upper coil; 5-upper disc; 

6–magnet; 7–lower disc; 8–lower coil; 9–lower core; 10–lower shaft

The actuator features increased energy efficiency, as 

the power supply is needed only during the switching 

between the two end positions of the mover. In each end 

position, the permanent magnet creates holding force, 

which keeps the mover in this position. 

III. STATIC FORCE CHARACTERISTICS 

Static magnetic field of the actuator is modeled using 

the finite element method and the program FEMM [7]. 

Axisymmetric model is adopted as the actuator features 

rotational symmetry. The electromagnetic force acting on 

the mover is obtained using the weighted stress tensor 

approach. 

Typical static force characteristics of the actuator are 

shown in Fig. 3. The stroke of the actuator, denoted with 

x, is set to zero when the shaft is situated symmetrically 

between the upper and lower cores. 

Figure 3: Typical force-stroke characteristics of the studied construction (force F in N versus stroke x in mm). c1 and c2 show the direction of the MMF in the two coils of the construction: c1=-1, c2=1: upward movement of the shaft; c1=1, c2=-1: downward movement of the shaft; c1=0, c2=0: non-energized coils, the force is due to the permanent magnet only.

The upper and lower curves in Fig. 3 represent the force when the shaft is moving in the upward and downward directions. The middle curve shows the force when no current flows in the coils. In that case the force is due to the magnetic flux of the permanent magnet. The characteristic is symmetrical about the origin of the force-stroke coordinate system, and its final values (when the shaft is close to the upper or lower core) are called the holding force Fh. This is the only force that keeps the shaft in each of the two stable positions (upper and lower), and it should resist the force created by the touching fingers and the mover's own weight.

The starting force Fs is the initial force that acts on the shaft when it is in its final upper position and both coils are energized so as to create a downward force, or conversely, when the shaft is in its final lower position and the force acts upward.

The construction should guarantee that the holding force created by the permanent magnet is overcome when the coils are properly energized.

The upper coil excites in the upper core a magnetic flux that is equal to or greater than the flux of the permanent magnet but oppositely directed. At the same time, the flux excited by the lower coil coincides with the flux of the permanent magnet.

The construction minimizes the requirements on the starting force and guarantees that the mover will start moving even for a small value of the starting force, provided that it exceeds the weight of the shaft.

IV. SECONDARY MODELS 

Finite element method, DoE and RSM have been used 

for creation of the secondary models. Full factorial design 

has been applied. 

The fixed geometric parameters are shown in Fig. 4 and their values are given in Table I.

Figure 4. Fixed parameters of the actuator.

TABLE I
FIXED GEOMETRIC PARAMETERS

Dimension                Designation (in Fig. 4)   Value (in mm)
Outer core diameter      D                         5
Outer magnet diameter    Dm                        2
Inner coil diameter      Dw1                       2.4
Outer coil diameter      Dw2                       4
Shaft diameter           Ds                        1
Inner core diameter      Dc                        1.2
Core thickness           hc                        2

The varied parameters are: 

- The length of the upper and lower cores - hw, 

- The axial dimension of the ferromagnetic disks - 

hd, 

- The length of the permanent magnet - hm, 

- The current density in the coils - J. 

The varied parameters with geometric representations 

are shown in Fig. 5.

Figure 5: Varied geometric parameters of the actuator. 

The DoE methodology has been used for varied 

parameters to create polynomial secondary models. For 

each combination of values of the varied parameters a 

family of static force-stroke characteristics was obtained. 

Based on them secondary models for holding force Fh, 

starting force Fs and ampere-turns of the coils – NI have 

been made. 

The precision of the secondary models has been estimated by the relative error between the value obtained by the secondary model and the corresponding value obtained by the FEM model. The difference between the secondary and FEM models for 27 calculation points is given in Fig. 6.
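A full factorial design and the relative error measure can be sketched in Python as follows. This is an illustration under assumptions, not the authors' procedure: the function names are hypothetical, the levels are coded as -1, 0 and +1, and the choice of three factors at three levels is only one combination that is consistent with the 27 calculation points mentioned above.

```python
from itertools import product

def full_factorial(levels):
    """Return every combination of factor levels as a list of dicts.

    levels maps a factor name to the list of its level values.
    """
    names = list(levels)
    return [dict(zip(names, combo))
            for combo in product(*(levels[name] for name in names))]

def relative_error_percent(secondary, fem):
    """Relative error of a secondary-model value with respect to the FEM value."""
    return 100.0 * (secondary - fem) / fem

# Usage: three factors at three coded levels (assumed), giving 3**3 = 27 runs.
design = full_factorial({"hw": [-1, 0, 1], "hd": [-1, 0, 1], "hm": [-1, 0, 1]})
print(len(design), design[0])
print(relative_error_percent(secondary=1.001, fem=1.0))
```

Each entry of `design` would correspond to one FEM run, and the polynomial secondary model is then fitted to the results of these runs.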

Figure 6: Relative error (in %) between secondary and FEM models, plotted against the number of the calculation point.

V. OPTIMIZATION 

The objective function is the minimal magnetomotive force of the coils. The optimization parameters are the dimensions of the permanent magnet, the ferromagnetic discs and the cores. As constraints, the minimal electromagnetic force acting on the mover, the minimal starting force and the overall outer diameter of the actuator have been set. The optimization is carried out using sequential quadratic programming.

The canonic form of the optimization problem is: 

 

 

 

 

 

 

 

 

 

where: 

- NI: ampere-turns, whose minimization reduces the energy consumption while the force requirements are satisfied;

- Fh: holding force, with the mover (shaft) in the upper position and no current in the coils;

- Fs: starting force, with the mover (shaft) in the upper or lower position and energized coils;

- J: coil current density;

- hw, hm, hd: geometric dimensions according to the sketch in Fig. 5.

Minimization of the magnetomotive force NI is a direct consequence of the requirement for minimum energy consumption.

Constraints for Fs and Fh have already been discussed. 

The lower bounds for the dimensions are imposed by the 

manufacturing limits and the upper bound for the current 

density is determined by the thermal balance of the 

actuator. 

The radial dimensions of the construction depend directly on the outer diameter of the core D, whose fixed value was discussed earlier. The influence of those parameters on the behavior of the construction has been studied in previous work [8], which makes clear that there is no need to include the radial dimensions in the set of optimization parameters.

The optimization is carried out by sequential quadratic 

programming. The optimization results are as follows: 

 

 

 

 

 

The optimal parameters were set as input values to the FEM model. The force-stroke characteristics of the optimal actuator are shown in Fig. 7 and Fig. 8.

Figure 7: Force-stroke characteristic of the optimal actuator (force Fh in N versus stroke x in mm). The force is created by the permanent magnet only (no current in the coils). The end points of the curve give the holding force Fh in the lower and upper positions of the shaft.

Figure 8. Force-stroke characteristic of the optimal actuator (force F in N versus stroke x in mm). Coils are energized. The shaft is displaced from the final upper to the final lower position. Marked points: Fs, starting force (shaft in upper position); F, final force (shaft moved to lower position, coils still energized); Fh, the force switches to the holding force when the current ceases.

In Figs. 9 and 10, the magnetic field of the optimal 

actuator is plotted for two cases. 

Figure 9: Magnetic field of the optimal actuator with 

shaft in upper position and coils energized to create 

downward force. 

Figure 10: Magnetic field of the optimal actuator with no 

current in the coils. 


The force constraints for Fs and Fh are active, which can be expected when minimum energy consumption is required. The active constraint for hw is also expected, because longer upper and lower cores, and correspondingly longer coils, increase the coil leakage flux and degrade the coil efficiency.

The employed approach has confirmed its robustness in solving the optimization problem for the actuator. The obtained optimal solution satisfies the specific requirements for actuators for Braille screens.

REFERENCES

[1] Nobels T., F. Allemeersch, K. Hameyer, "Design of a high power density electromagnetic actuator for a portable Braille display." Int. Conf. EPE-PEMC 2002, Dubrovnik & Cavtat, 2002.

[2] Kawaguchi Y., K. Ioi, Y. Ohtsubo, "Design of new Braille display using inverse principle of tuned mass damper." Proc. of SICE annual conference 2010, Taipei, Taiwan, Aug. 18-21, pp. 379-383.

[3] Kwon, H.J., Lee, S.W., Lee, S. Braille code display device with a PDMS membrane and thermopneumatic actuator. IEEE international conference on micro electro mechanical systems (XXI), MEMS, Tucson, 2008, pp. 527-530.

[4] Chaves, D., Peixoto, I., Lima, A., Vieira, M., de Araujo, C. Microtuators of SMA for Braille display system. IEEE international workshop on medical measurements and applications, MeMeA Cetraro, Italy, May 20-30, 2009, pp. 64-68.

[5] Hernandez, H., Preza, E., Velazquez, R. Characterization of a piezoelectric ultrasonic linear motor for braille displays. Electronics, robotics and automotive mechanics conference CERMA Cuernavaca, Mexico, Sep. 22-25, 2009, pp. 402-407.

[6] Cho, H.C., Kim, B.S., Park, J.J., Song, J.B. (2006) Development of a Braille display using piezoelectric linear motors. International joint conference SICE-ICASE, 2006, Busan, Korea, Oct. 18-21, pp. 1917-1921.

[7] D. Meeker, Finite element method magnetics version 3.4, 2005.

[8] Yatchev I., K. Hinov, V. Gueorgiev, D. Karastoyanov, I. Balabozov, Force characteristics of an electromagnetic actuator for Braille screen, Proceedings of Thirteenth International Conference on Electrical Machines, Drives and Power Systems ELMA 2011, 21-22 October 2011, Varna, Bulgaria, pp. 338-341.


3D Finite Element Analysis of Induction Heating 

System for High Frequency Welding 

*Ilona I. Iatcheva, *Georgi H. Gigov , *Georgi C. Kunov and *Rumena D. Stancheva 

*Technical University of Sofia, Kliment Ohridski 8, Sofia 1000, Bulgaria 

E-mail: iiach@tu-sofia.bg 

Abstract—The aim of the work is investigation of induction heating system used for longitudinal, high frequency pipe 

welding. The problem was considered as 3D coupled electromagnetic and temperature field problem and has been solved 

using finite element method and COMSOL 4.2 software package. Time harmonic electromagnetic and transient thermal fields 

have been studied in order to estimate system efficiency and factors influencing on the quality of the welding process and 

required energy. 

Index Terms— finite element method, high frequency welding, 3D coupled field analysis. 

I. INTRODUCTION

Induction heating is widely used in the heat treatment of conducting parts due to its advantages: high quality and efficiency of the heating processes, good accuracy in heating of certain zones in a short time and clean operating conditions [1]-[4].

The aim of the present research is the investigation of an induction heating system used for high frequency longitudinal pipe welding. The main task is to determine the optimal factors and parameters influencing the quality of the welding process and the required energy: welding frequency, welding speed, 'vee' angle, presence of the ferrite impeder (inner and outer), tube thickness, etc.

The solution of the problem is based on precise 3D modelling and FEM analysis of the electromagnetic and thermal processes taking place in the investigated system. Detailed determination of the electromagnetic and temperature field distribution and its dependence on the above-mentioned parameters is an important condition for effective control and management of the welding process.

II. INVESTIGATED INDUCTION HEATING SYSTEM

The principal geometry of the investigated system is shown in Fig.1.

Figure 1: Geometry of the investigated induction system (labelled parts: squeeze roll, point of closure, direction of movement, impeder core, inductor, steel pipe, cooling water, welded bond).

The system is designed for high frequency welding of small scale, carbon steel tubes and pipes. It consists of a spiral inductor, which induces a voltage across the edges of the moving open pipe material. The induced voltage causes high frequency currents, concentrated on the surface layer due to the skin and proximity effects. The currents flow along the two edges in opposite directions in the so-called "V"-zone (Fig.2) to the point where they meet, causing rapid heating of the metal and surface melting. The weld squeeze rolls are used to apply pressure, which forces the heated metal into contact and forms the welding bond.

Figure 2: In the "V"-zone HF currents flow along the two edges in opposite directions.

The system also includes an inner ferrite impeder, which concentrates the magnetic flux and improves the welding efficiency. Cooling water flows inside the inductor and the impeder to cool the system.

As can be seen from the geometry in Fig.1, the impeder is not located along the pipe axis, but is moved closer to the welded region, i.e. the system is not axisymmetric and has to be analysed as three-dimensional.

The system has been investigated, and the electromagnetic and thermal processes have been modelled for the parameters shown in Table I.

TABLE I
PARAMETERS OF THE SYSTEM

Parameter                    Value
Applied current I            1000 A
Voltage U                    500 V
cos φ                        0.1
Frequency f                  200 kHz to 500 kHz
End heating temperature      1300 to 1450 °C
Cooling water temperature    40 °C

III. MATHEMATICAL MODEL OF THE COUPLED FIELD 

PROBLEM 

Mathematical modeling of the processes in the investigated system for high frequency welding is based on the analysis of the coupled electromagnetic and temperature field distribution in the considered device. As already mentioned, the geometry of the object is complex and nonsymmetric, so the electromagnetic and thermal fields have to be studied as three-dimensional.

The present work deals with modeling of the 3D time harmonic electromagnetic field. The eddy current losses obtained in the electromagnetic field analysis are the field sources in the modeling of the transient thermal field.

The electromagnetic field problem has been studied not only in the system elements, but also in a wide buffer zone around the device. This helps to define correct boundary conditions in the field modeling. Fig.3 shows the investigated region used in the electromagnetic field modeling. It includes the domains: 1- inductor; 2- impeder; 3- welded pipe; 4- cooling water; 5- buffer zone with air.

Figure 3: Investigation domains (numbered 1 to 5 as listed above).

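Electromagnetic field distribution can be described with equations (1) and (2):

\[
\sigma \frac{\partial \mathbf{A}}{\partial t} + \nabla \times \left( \mu^{-1} \nabla \times \mathbf{A} \right) = \mathbf{J}_e \tag{1}
\]

\[
\mathbf{E} = -j\omega \mathbf{A} - \nabla V \tag{2}
\]

where A is the magnetic vector potential, J is the current density, E is the electric field strength, V is the scalar electric potential, σ is the electric conductivity and μ is the magnetic permeability.

The boundary conditions are $\mathbf{A} = 0$ for the buffer zone boundaries.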

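The time varying electromagnetic field produces eddy currents:

\[
\mathbf{J} = -j\omega\sigma \mathbf{A} \tag{3}
\]

and corresponding Joule losses, which are the source of the heating in the region:

\[
Q = \frac{1}{2\sigma} \left[ \mathbf{J} \cdot \mathbf{J}^{*} \right] \tag{4}
\]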

The transient thermal field is modeled by the equation:

\[
\rho C \frac{\partial T}{\partial t} + \nabla \cdot \left( -k \nabla T \right) = Q \tag{5}
\]

where k is the thermal conductivity, T is the temperature, ρ is the density, C is the heat capacity and Q is the heat source obtained in the electromagnetic field analysis.
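The coupling between the two field problems is the heat source Q. A minimal Python sketch of this step is given below. It reads Eq. (4) as the standard time-averaged Joule loss density for peak-value phasors, Q = J·J*/(2σ); since the printed equation is only partly legible, this reading is an assumption, as are the function name and the conductivity value used in the example.

```python
def joule_loss_density(j_phasor, sigma):
    """Time-averaged Joule loss density Q = J . J* / (2 sigma) in W/m^3.

    j_phasor is a tuple (Jx, Jy, Jz) of complex peak-value phasors in A/m^2,
    sigma is the electric conductivity in S/m.
    """
    return sum(abs(component) ** 2 for component in j_phasor) / (2.0 * sigma)

# Usage: a current density of the order reported in the results section
# (about 1.13e9 A/m^2) and a purely hypothetical conductivity of 5.0e6 S/m.
q = joule_loss_density((1.13e9 + 0.0j, 0.0j, 0.0j), sigma=5.0e6)
print(q)
```

In a coupled simulation this value would be evaluated in every element of the pipe and passed to the thermal problem (5) as the source term.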

IV. FEM ANALYSIS - 3D COUPLED PROBLEM 

Numerical simulation of the coupled - electromagnetic 

and thermal fields was carried out using FEM and 

COMSOL 4.2 package [4]. 

Fig.4 shows the investigated system with the buffer zone around it, and Fig.5 presents the FEM mesh used in solving the problem.

Figure 4: Investigated system with the buffer zone around it. 

Figure 5: FEM mesh.

Some results obtained in solving the problem for a frequency of 300 kHz are shown in Fig. 6, Fig. 7, Fig. 8, Fig. 9 and Fig. 10.

The analysis of the electromagnetic field distribution indicates that the maximal value of the magnetic flux density is about 0.19 T. These values are reached in the "V" zone and around the inductor. Two different cross sections in Fig.6 and Fig.7 illustrate the distribution of the magnetic flux density in the system.

Figure 6: Distribution of magnetic flux density in the 

investigated region, f= 300 KHz. 

Figure 7: Distribution of magnetic flux density along the 

“V” zone, f= 300 KHz. 

The results obtained for the current density distribution in the entire region are shown in Fig.8. Two cross sections specific to the problem are singled out: around the "point of closure" and around the spiral inductor. The maximal value is 1.13 × 10^9 A/m^2. The current density distribution around the "point of closure" is shown in Fig.9, and around the spiral inductor in Fig.10.

Figure 8: Current density distribution in the entire region 

Figure 9: Current density distribution around “point of 

closure” 

V. CONCLUSION 

A 3D coupled electromagnetic and temperature field problem has been solved using the finite element method and the COMSOL 4.2 software package in order to investigate an induction heating system for high frequency pipe welding. The obtained temperature value around the "point of closure" is about 1400 °C.

REFERENCES

[1] R. Baumer, Y. Adonyi, "Transient High-Frequency Welding Simulations of Dual-Phase Steels", Welding Journal, October 2009, vol. 88, pp. 193-201.

[2] D. Kim, T. Kim, Y. Park, K. Sung, M. Kang, C. Kim, I. Lee and S. Rhee, "Estimation of weld quality in high-frequency electric resistance welding", Welding Journal, March 2007, pp. 27-31.

[3] A. Shamov, I. Lunin, V. Ivanov, High frequency metal welding, Leningrad, "Mashinostroenie", 1977 (In Russian).

[4] COMSOL Version 4.2 User's Guide, 2011.



Optimization Algorithms in the View of State 

Space Concepts 

M. Neumayer∗ , D. Watzenig∗ , G. Steiner∗ , and B. Brandstätter † 

∗Institute of Electrical Measurement and Measurement Signal Processing, Graz University of Technology, 

Kopernikusgasse 24/4, A-8010 Graz, Austria, E-mail: neumayer@TUGraz.at 

† Elin Motoren GmbH, Elinmotorenstrasse 1, A-8160 Preding/Weiz, Austria 

Abstract—The working principles of optimization algorithms offer several characteristics which naturally arise in state 

estimation, or more generally when dealing with state space systems. In this paper we will treat similarities between the 

two disciplines and show how concepts of state estimation, including the incorporation of model uncertainty information, 

can be used in optimization. 

Index Terms—optimization, state space methods 


Numerical optimization is generally referred to as solving a problem of the form [1]

$$x^* = \arg\min \Psi(x) \tag{1}$$

$$\text{s.t. } C(x) \le 0, \tag{2}$$

where $\Psi: \mathbb{R}^N \to \mathbb{R}^1$ is called the objective function and the vector $x \in \mathbb{R}^N$ contains the variables of interest. Possible constraints on the vector $x$ are formulated by the vectorial function $C(x)$ as a set of equalities and inequalities. By this the space of feasible solutions becomes a subset of $\mathbb{R}^N$.

An enormous variety of algorithms and solution strategies for such problems has been the output of research activities in the past years. Yet it has to be mentioned that the presented form of the optimization problem covers only a part of the problems encountered in practice. For example, the discipline of multi-objective optimization asks for an optimal solution given several objective functions $\Psi_i$ [2]. Such formulations are of importance in multi-physics problem scenarios. Another important class are robust optimization approaches, which aim for a stable solution under uncertainty or tolerances in the objective function [3]. Thus, existing manufacturing tolerances can be incorporated into an optimization-based design process. A general distinction between the large number of different (iterative) optimization algorithms used today can be made by separating them into deterministic and stochastic methods.

Deterministic methods most often make use of gradient and curvature information of the objective function $\Psi$ in order to find the minimum efficiently. Efficiency is typically measured by the number of evaluations of $\Psi$. Classical first and second order deterministic methods try to minimize $\Psi$ by determining a descent direction from gradient, or from gradient and Hessian, information. For example, the classical steepest descent method uses the iteration

$$x_{k+1} = x_k - s\,g(x_k), \tag{3}$$

to find $x^*$ in a step by step approach. Here $s$ is the step size and $g(x_k) = \nabla\Psi(x_k)$ is the gradient of $\Psi$ with respect to the elements of $x_k$. Due to this nature the result of deterministic methods can be strongly affected by the starting point $x_0$. Also, local minima of $\Psi$ will result in a termination of the algorithm before the global minimum is found. Stochastic methods rely on some sort of randomness to explore the parameter space in search of the minimum. A main difference with respect to deterministic methods is their ability to overcome local minima of the objective function. For stochastic optimization algorithms it has become common to let them run for a certain time or a number of evaluations of $\Psi$. Of course, hybrid algorithms have been proposed to combine the advantages of the two classes of algorithms.
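As a minimal sketch of the steepest descent iteration (3), the following Python snippet applies a fixed step size to a quadratic test objective. The function names, the test objective, the step size and the iteration count are assumptions chosen for illustration only; they are not taken from the paper.

```python
import numpy as np

def steepest_descent(grad, x0, s=0.1, n_iter=100):
    """Iterate x_{k+1} = x_k - s * g(x_k), cf. equation (3)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(n_iter):
        x = x - s * grad(x)
    return x

# Hypothetical test objective: Psi(x) = (x1 - 1)^2 + 2 * (x2 + 0.5)^2
def grad_psi(x):
    return np.array([2.0 * (x[0] - 1.0), 4.0 * (x[1] + 0.5)])

x_star = steepest_descent(grad_psi, x0=[0.0, 0.0])
```

For this convex example every starting point leads to the same minimizer; for objectives with several local minima the result would depend on the chosen starting point, as noted in the text.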

Although we have only pointed out the basics of some fundamental concepts of optimization, we can observe that in all algorithms some kind of evolution from the vector $x_k$ to the vector $x_{k+1}$ occurs. Modern system theory uses so-called state space models as a unified framework to describe dynamical systems [4]. The general form of a discrete time, nonlinear, time-variant state space model is given by

$$x_{k+1} = F_k(x_k) + B_k(u_k) + w_k, \tag{4}$$

$$y_k = H_k(x_k) + v_k, \tag{5}$$

where $F_k: \mathbb{R}^N \to \mathbb{R}^N$ represents the system dynamics, $B_k: \mathbb{R}^L \to \mathbb{R}^N$ describes the effect of an input term $u_k \in \mathbb{R}^L$ on $x_{k+1}$, and $H_k: \mathbb{R}^N \to \mathbb{R}^M$ describes a measurement process. The terms $w_k \in \mathbb{R}^N$ and $v_k \in \mathbb{R}^M$ are referred to as process noise and measurement noise. We observe some similarities between the state space concept and the topics discussed in connection with optimization. Yet we have to say that state space methods and models follow a quite organized scheme.
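To make the roles of the terms in equations (4) and (5) concrete, the following sketch simulates a generic discrete time state space model. The helper `simulate`, the scalar example system and the noise levels are hypothetical and serve only as an illustration under these assumptions.

```python
import numpy as np

def simulate(F, B, H, x0, inputs, w_std=0.0, v_std=0.0, seed=0):
    """Simulate x_{k+1} = F(x_k) + B(u_k) + w_k and y_k = H(x_k) + v_k."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    states, outputs = [x.copy()], []
    for u in inputs:
        y_clean = np.asarray(H(x), dtype=float)
        # Measurement equation (5) with additive measurement noise v_k
        outputs.append(y_clean + v_std * rng.standard_normal(y_clean.shape))
        # State equation (4): drift F, input B(u) and process noise w_k
        x = F(x) + B(u) + w_std * rng.standard_normal(x.shape)
        states.append(x.copy())
    return np.array(states), np.array(outputs)

# Hypothetical scalar example without noise and with zero input
states, outputs = simulate(
    F=lambda x: 0.5 * x,
    B=lambda u: np.array([u]),
    H=lambda x: x,
    x0=[1.0],
    inputs=[0.0, 0.0, 0.0],
)
```

Setting `w_std` or `v_std` to a positive value adds the stochastic diffusion and the measurement disturbance described in the text.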

In this paper we will point out similarities between optimization techniques and state space models and methods. The paper is structured as follows. In section II we give a short introduction to the state space concept. In section III we review optimization techniques in the sense of state space methods and present similarities as well as mathematical tools with potential for a general description. Section IV lists several state space techniques which relate to topics of optimization and thus could potentially be used to improve optimization. Finally, in section V we present an exemplary hybrid optimization scheme which we derive from the state space view and demonstrate its behavior using some of the suggested approaches.

II. THE STATE SPACE CONCEPT IN MORE DETAIL 

Fig. 1. Diagram of the state space model given by equations (4) and (5).

Figure 1 depicts the structure of a state space model given by equations (4) and (5). The core purpose of a state space model is to describe the evolution of the state vector $x$ over time or over the discrete time steps, respectively. As can be observed from equations (4) and (5) or from figure 1, this evolution is determined by a deterministic drift due to the dynamics $F_k$ and the input $u_k$, and by a stochastic diffusion due to the process noise $w_k$. The system is referred to as an autonomous system if $B$ is zero. The function $H_k$ provides a deterministic measure of the internal state. In addition, the measurement noise $v_k$ acts as an additive disturbance. We can already observe that the state space concept provides, in a natural way, several aspects which we pointed out in the introductory part about optimization algorithms.

It should be mentioned that a state space model is referred to as linear if all system components are matrices. This class is of great importance as many technical processes can be described by it.

A. State Space Methods 

In this part of section II we want to give a short introduction to two important disciplines associated with state space models. These are state estimation and state control.

State estimation refers to the task of finding an estimate $\hat{x}_k$ of the state $x_k$ using the measurements $y_k$, the input $u_k$ and the model.



State control is a special kind of feedback control 

where the system input uk is formed as a function of 

the state vector xk. A notable controller out of this class 

is the dead beat control system. This approach enables 

a control system to reach the steady state within a finite 

number of iterations. 
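The dead beat property can be illustrated with a small linear example. The plant matrices below (a discrete double integrator) and the feedback gain are assumptions made for this sketch; for this particular plant the gain places both closed loop eigenvalues at zero, so the closed loop matrix is nilpotent and the state reaches the origin after two steps.

```python
import numpy as np

# Hypothetical linear plant x_{k+1} = A x_k + B u_k (double integrator)
A = np.array([[1.0, 1.0],
              [0.0, 1.0]])
B = np.array([[0.0],
              [1.0]])

# State feedback u_k = -K x_k. For this plant K = [1, 2] places both
# eigenvalues of (A - B K) at zero, hence (A - B K)^2 = 0.
K = np.array([[1.0, 2.0]])
A_cl = A - B @ K

def run(x0, n_steps):
    """Apply the state feedback for n_steps and return the trajectory."""
    x = np.asarray(x0, dtype=float)
    trajectory = [x.copy()]
    for _ in range(n_steps):
        u = -K @ x
        x = A @ x + B @ u
        trajectory.append(x.copy())
    return np.array(trajectory)

trajectory = run([3.0, -2.0], n_steps=3)
```

The number of steps needed equals the state dimension here, which is the finite number of iterations referred to in the text.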

III. OPTIMIZATION AND STATE SPACE METHODS 

Looking at all points discussed so far, we can consider a relation between the measurement function $H_k$ and the objective function $\Psi$. For example, for a design problem where one is interested in meeting a desired output $y_d$, $\Psi$ could be of the form

$$\Psi(x) = (H_k(x) - y_d)^T W (H_k(x) - y_d). \tag{6}$$

Here the positive definite matrix $W$ is a weighting matrix. From a system theoretic point of view we could consider the function $\Psi$ as a (nonlinear) control plant of MISO (multiple input single output) type.

A. Classical Deterministic Methods Reviewed 

Classical deterministic optimization methods like the already mentioned steepest descent method (see equation (3)) make use of local gradient or curvature information of the function $\Psi$. While the steepest descent algorithm only makes use of the gradient information, the well known Gauss-Newton (GN) method defined by

$$x_{k+1} = x_k - s\,G_k^{-1} g_k, \tag{7}$$

makes use of the Hessian matrix $G$, which provides curvature information about $\Psi$, to improve the convergence behavior. In both schemes, the steepest descent method and the GN method, the system matrix $F$ is given by the identity matrix $I$. For objective functions $\Psi$ of form (6), the practical realization of the GN method is given by

$$x_{k+1} = x_k - s\,(JJ^T)^{-1} J r, \tag{8}$$

where $J$ is the Jacobian of the system $H$ with respect to the state vector $x$. Here $r = (y - y_d)$ defines the residual vector of the output $y$ of $H$ with respect to $y_d$. The gradient $g = \nabla_x\Psi$ of the objective function (6) with respect to $x$ (to keep the notation short we set the matrix $W$ to be the identity matrix $I$) is given as $g = J(y - y_d)$, and $(JJ^T)$ approximates the Hessian $G$ [1].
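The GN update of equation (8) can be sketched as follows. The sketch uses the convention of the text, i.e. $J$ has one row per state variable, so that $Jr$ is the gradient direction and $JJ^T$ approximates the Hessian. The model `H`, its Jacobian, the target state and the starting point are hypothetical and only serve as an illustration.

```python
import numpy as np

def gauss_newton(H, jac, y_d, x0, s=1.0, n_iter=20):
    """Iterate x <- x - s * (J J^T)^{-1} J r with r = H(x) - y_d, cf. (8)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(n_iter):
        J = jac(x)        # shape (N, M): one row per state variable
        r = H(x) - y_d    # residual vector, shape (M,)
        x = x - s * np.linalg.solve(J @ J.T, J @ r)
    return x

# Hypothetical model with N = 2 states and M = 3 outputs
def H(x):
    return np.array([x[0], x[1], x[0] * x[1]])

def jac(x):
    # Transposed derivative of H with respect to x
    return np.array([[1.0, 0.0, x[1]],
                     [0.0, 1.0, x[0]]])

y_d = H(np.array([2.0, 3.0]))   # desired output generated from a known state
x_hat = gauss_newton(H, jac, y_d, x0=[1.5, 2.5])
```

Solving the linear system with `np.linalg.solve` avoids forming the inverse of $JJ^T$ explicitly, which is a common design choice for this kind of update.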

 

Fig. 2. State space representation of a second order scheme.

Figure 2 depicts the GN scheme as a control system for the objective function (plant) $\Psi$ following equation (8). For the steepest descent method the matrix $B$ is replaced by the identity matrix $I$. Note that all matrices depend on the iteration index $k$. A control system with this property is referred to as a time varying control system. We observe that neither the steepest descent algorithm nor the GN method is a state space control system, as these methods do not make use of the state vector $x$. However, we can observe a closed loop scheme in figure 2. It is hard to argue whether the steepest descent algorithm should be seen as a drive system or as a closed loop control system, since for $B = -g$ and $s$ replacing the (scalar) input $u$ no closed loop is required. However, with respect to the different input matrix $B$, the power of the GN method becomes clear from a system theoretic point of view.

B. Stochastic Methods Reviewed 

With the availability of more and more computational power, stochastic optimization methods have become of increased interest for many practical problems. Attractive features of stochastic methods are their ability to overcome local minima and the fact that no derivative information is required. This is of concern for non-differentiable or discontinuous problems.

In contrast to deterministic methods, stochastic methods most often rely on a set of $N$ individual vectors $x_N$ which explore the objective function on their own. Over time a mutual exchange of information between the different realizations $x_N$ is performed, which mixes the individuals. Concepts for the exploration of $\Psi$ by each individual as well as for the exchange of mutual information between the individuals are often based on concepts from nature, like evolution principles, resulting in the class of genetic algorithms (GA). For example, certain elements of two arbitrarily selected vectors $x_i$ and $x_j$ are exchanged, replaced by a weighted mean, or just individually disturbed by a random variable. A contrasting aspect with respect to the behavior of deterministic methods is the fact that the combining principles do not automatically remove the weakest individual (the realization with the highest value of $\Psi$). Instead, the strongest individual can also be removed by some random procedure. Exactly this property enables stochastic methods to overcome local minima. Other well known strategies for stochastic optimization are particle swarm optimization (PSO), niching evolution techniques or differential evolution (DE) [5]. The behavior of an ant colony or of bacteria in a nutrient solution [6] has also been used as a strategy to find a solution minimizing $\Psi$.

The enormous variety of differently labeled stochastic algorithms [7] often makes it hard to distinguish between them. More importantly, it is hard to assess the efficiency of the different methods and their suitability for different applications. In the following we will provide an approach to present several aspects of stochastic optimization within the unified framework of state space techniques.

In state space models randomness enters the system in a unified way through the process noise $w$. By setting the deterministic input vector $u$ to zero the resulting system becomes an autonomous system. While different stochastic optimization strategies originate from more or less random inspirations, system theory makes use of probabilistic methods to describe the behavior in connection with randomness. Here any random process is described by a probability density function (pdf) denoted by $\pi(\cdot)$. The mathematical framework used to describe stochastic behavior is based on Bayes' law

$$\pi(x|y) = \frac{\pi(y|x)\,\pi(x)}{\pi(y)} \propto \pi(y|x)\,\pi(x), \tag{9}$$

where $\pi(y|x)$ is referred to as the likelihood function and $\pi(x)$ is referred to as the prior. The evidence $\pi(y)$ has the role of a normalization constant and can be skipped, leading to the right hand side formula of the posterior distribution $\pi(x|y)$. The likelihood function provides a probability measure for $x$ originating a certain output $y$. The prior $\pi(x)$ gives a probability statement about $x$.

We can already link these concepts to the optimization problem given by equation (1) and the constraints formulated in equation (2), as the prior obviously is able to express $C(x)$ by becoming zero for infeasible solutions. However, this concept also enables the possibility of a continuous measure for the state $x$, i.e. we can incorporate "gray regions" for the solution.

The interpretation of the likelihood is maybe not that obvious. For easier explanation we write the likelihood corresponding to the objective function (6) as

$$\pi(y_d|x) \propto \exp\left(-(H_k(x) - y_d)^T W (H_k(x) - y_d)\right). \tag{10}$$

The likelihood function is the exponential of the negative objective function, i.e. it states $\Psi$ as a probability measure. It has to be noted that $0 < \int_{\mathbb{R}^N} \exp(-\Psi(x))\,dx < \infty$ has to hold in the Lebesgue sense in order to form a likelihood function from an objective function. Such a formulation is known from simulated annealing (SA). There a stochastic algorithm seeks the modes (maxima) of the function $\exp(-\Psi(x)/T)$, where $T$ is an artificial temperature which decreases over time. Note that due to the temperature $T$, SA differs from Bayesian inference, as the likelihood has a physical meaning where no term like $T$ occurs. Given all these aspects, stochastic optimization can be fully seen in the context of state estimation, and we can work out some conceptual ideas that are used in state estimation in the next section.
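The relation between an objective function and the function $\exp(-\Psi(x)/T)$ can be visualized numerically. The following sketch evaluates a hypothetical one-dimensional objective with two minima on a grid, normalizes $\exp(-\Psi/T)$ to a density and measures how much probability mass lies near the global minimum for two temperatures. The objective, the grid and the temperatures are assumptions for illustration only.

```python
import numpy as np

x = np.linspace(-3.0, 3.0, 1201)
dx = x[1] - x[0]

# Hypothetical objective with a global minimum near x = -1
# and a local minimum near x = +1
psi = (x**2 - 1.0) ** 2 + 0.3 * x

def tempered_density(psi, T, dx):
    """Normalize exp(-Psi / T) on the grid so that it integrates to one."""
    p = np.exp(-(psi - psi.min()) / T)   # shift by the minimum for stability
    return p / (p.sum() * dx)

def mass_near(density, center, half_width):
    """Probability mass within half_width of center."""
    mask = np.abs(x - center) < half_width
    return density[mask].sum() * dx

x_min = x[np.argmin(psi)]
mass_hot = mass_near(tempered_density(psi, 1.0, dx), x_min, 0.5)
mass_cold = mass_near(tempered_density(psi, 0.1, dx), x_min, 0.5)
```

Lowering the temperature concentrates the mass around the global minimizer, which is the mechanism SA relies on; the mode of the density coincides with the minimizer of $\Psi$ for every temperature.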

The exchange of mutual information depends on a so-called resampling scheme which stays outside the state space model. While different stochastic methods have brought up a variety of exchange schemes, state estimation methods have brought up unified methods like residual, stratified, or systematic resampling, etc. [8]. We will not focus on these aspects of stochastic optimization methods at this point, but we will provide a description of the stochastic diffusion of states in the state space view of optimization.
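Of the resampling schemes named above, systematic resampling is particularly compact. The following sketch shows one common way to write it; the function name and the example weights are assumptions, and the code is not claimed to be the variant analyzed in [8].

```python
import numpy as np

def systematic_resample(weights, rng):
    """Return indices of the individuals selected by systematic resampling."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    n = len(w)
    # One random offset, then n equally spaced positions in [0, 1)
    positions = (rng.random() + np.arange(n)) / n
    cumulative = np.cumsum(w)
    cumulative[-1] = 1.0   # guard against round-off in the last entry
    return np.searchsorted(cumulative, positions, side="right")

rng = np.random.default_rng(0)
weights = np.array([0.1, 0.2, 0.7])   # hypothetical normalized weights
indices = systematic_resample(weights, rng)
```

Each individual is selected roughly in proportion to its weight, so individuals with a weight of zero are never drawn while individuals with a large weight are duplicated.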

While deterministic methods select the state update from gradient or curvature information in order to decrease $\Psi$ and thus follow strictly deterministic rules, the probabilistic change is summarized by means of pdfs $\pi(\cdot)$. State space theorists have developed the Chapman-Kolmogorov equation

$$\pi(x_k|y_d) = \int_{\mathbb{R}^N} \pi(x_k|x_{k-1})\,\pi(x_{k-1}|y_d)\,dx_{k-1}, \tag{11}$$

to provide a probabilistic measure of the state evolution given the current state and its posterior. While equation (11) is not hard to derive using the mathematical tool of marginalization, it provides two interesting insights into the update in stochastic optimization methods.

• The state update is described by $\pi(x_k|x_{k-1})$ and does not depend on the current value of the objective function.

• The update probability depends on $\pi(x_{k-1}|y_d)$, but there is no guarantee that $x_{k-1}$ will be changed.

The transition kernel $\pi(x_k|x_{k-1})$ describes the probability of the state change from state $x_{k-1}$ to the state $x_k$. A remarkable point about this formulation is the fact that the update is independent of the current value of $\Psi$ or the posterior. This is an important fact that explains the power of stochastic methods. If the kernel depended on $\Psi$, stochastic methods would end with the same stalling behavior in local minima as deterministic methods do, as then a deterministic drift would be present. In most cases the kernel $\pi(x_k|x_{k-1})$ is even reduced to $\pi(x_k)$. The pdf $\pi(x_{k-1}|y_d)$ in equation (11) induces another important principle in stochastic optimization which can be directly connected to the mutual information exchange. It states that the update of the state due to the proposal kernel is not guaranteed. Instead, the result $\pi(x_k|y_d)$ only provides a relative number for the new state $\pi(x_k)$ to be accepted.
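For a scalar state, equation (11) can be approximated on a grid, where the integral becomes a matrix-vector product. The sketch below assumes a Gaussian random walk as transition kernel and a narrow Gaussian as the previous posterior; the grid, the kernel width and all names are hypothetical choices for illustration.

```python
import numpy as np

grid = np.linspace(-1.0, 1.0, 201)
dx = grid[1] - grid[0]

def random_walk_kernel(grid, sigma, dx):
    """K[i, j] approximates pi(x_k = grid[i] | x_{k-1} = grid[j])."""
    diff = grid[:, None] - grid[None, :]
    K = np.exp(-0.5 * (diff / sigma) ** 2)
    # Normalize every column so that it integrates to one over x_k
    return K / (K.sum(axis=0, keepdims=True) * dx)

def propagate(K, pdf_prev, dx):
    """Discrete Chapman-Kolmogorov step, cf. equation (11)."""
    return (K @ pdf_prev) * dx

# Hypothetical previous posterior: narrow Gaussian centered at 0.3
pdf_prev = np.exp(-((grid - 0.3) / 0.05) ** 2)
pdf_prev /= pdf_prev.sum() * dx

K = random_walk_kernel(grid, sigma=0.1, dx=dx)
pdf_next = propagate(K, pdf_prev, dx)
```

Note that the kernel is built without any reference to the objective function, in line with the first observation listed above; the step only spreads the previous density.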

IV. STATE SPACE METHODS FOR OPTIMIZATION 

In this section we want to discuss some more state space concepts and their use for stochastic optimization. We have selected these methods as we see them to be important for today's needs. State estimation techniques are among the algorithms which have seen one of the strongest developments in the past decades. The early origin was given by the Apollo space flight program in the 1960s, where the Kalman filter had its breakthrough. Since then both single point and population-based methods have been developed to regain knowledge about the hidden states of a system, given the actually observed function values for an optimally designed objective function, in order to recover $x$ from noisy observations. In this sense we first have to discuss the meaning of the likelihood function $\pi(y_d|x)$ in more detail. Following the definition of the pdf of a multivariate Gaussian random variable $y$

$$\pi(y) \propto \exp\left(-(y - \mu)^T \Sigma^{-1} (y - \mu)\right), \tag{12}$$

where $\mu$ is the mean and $\Sigma$ is the covariance matrix, we observe that the likelihood function has the form of a Gaussian distribution expressing uncertainty about $y$. In this sense the measurement noise $v$ becomes relevant for the first time, as the likelihood function expresses this noise in terms of a probability measure. This will lead us directly to the aspects presented in the following subsection.

The consistent use of this approach has brought up powerful stochastic state estimation algorithms like state observers, the already mentioned Kalman filter, sequential Monte Carlo methods like particle filters, or the even more powerful Markov chain Monte Carlo (MCMC) methods.

A. Enhanced Error Model 

A matter of concern in the solution of physically motivated optimization problems is the computational cost of evaluating the objective function $\Psi$. This especially holds if the underlying problem requires the solution of partial differential equations (PDEs), which has to be done by numerical methods like the finite element method (FEM). Recently the use of approximation techniques has become popular in both state estimation and optimization [9], [10]. Here the computationally costly evaluation of $H_k$ is replaced by a cheap approximation or surrogate function $H_k^*$. Subsequently this leads to the cost function $\Psi^*$ due to the approximation error

$$e = H_k^* - H_k. \tag{13}$$

We can reformulate the relation between $H_k$ and $H_k^*$ to

$$H_k^* = H_k + (H_k^* - H_k) = H_k + e. \tag{14}$$

This is an interesting formulation as we can look at the approximation error $e$ as an additive error similar to the measurement noise $v$ depicted in figure 1. Although the approximation error $e$ depends on the state $x$, and thus is a deterministic error, we can think about a probabilistic description of $e$ in terms of a Gaussian distribution. This is an approach often taken in several fields of state estimation and system theory. It ends up exactly in the idea covered by the so-called enhanced error model [11]. Although the approximation error $e$ is of deterministic nature, a probabilistic model is built from samples over the state space $\mathbb{R}^N$. Then the likelihood function $\pi^*(y_d|x)$ becomes

$$\pi^*(y_d|x) \propto \exp\left(-(y^* - y_d + \mu_e)^T \Sigma_e^{-1} (y^* - y_d + \mu_e)\right), \tag{15}$$

and the optimization can be performed on this computationally less costly function. Depending on the degree of accuracy of the approximation $H_k^*$, the solution can be seen to be as good as a solution obtained with $H_k$, or the approximation approach can be used to find a good initial solution which can be refined in fewer optimization steps using the accurate model.

In general the determination of the mean $\mu_e$ and the covariance matrix $\Sigma_e$ requires a large number of samples. However, during the setup and model testing phase for the optimization problem typically enough data is generated to describe $e$ in the presented way.
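The sample-based estimation of $\mu_e$ and $\Sigma_e$ can be sketched as follows, using the error definition of equation (13). The two models `H_fine` and `H_coarse` are hypothetical stand-ins for an accurate and a reduced model, and the uniform sampling of the state space is an assumption made for this illustration.

```python
import numpy as np

def error_statistics(H_fine, H_coarse, samples):
    """Estimate mean and covariance of e = H* - H, cf. equation (13)."""
    errors = np.array([H_coarse(x) - H_fine(x) for x in samples])
    mu_e = errors.mean(axis=0)
    sigma_e = np.cov(errors, rowvar=False)
    return mu_e, sigma_e

# Hypothetical accurate model and cheap approximation (N = 2, M = 2)
def H_fine(x):
    return np.array([np.sin(x[0]) + x[1], x[0] * x[1]])

def H_coarse(x):
    return np.array([x[0] + x[1], x[0] * x[1] + 0.1])

rng = np.random.default_rng(0)
samples = rng.uniform(0.0, 1.0, size=(500, 2))   # samples over the state space
mu_e, sigma_e = error_statistics(H_fine, H_coarse, samples)
```

In a practical setting the samples would typically be collected during setup and model testing, as noted in the text, so that no additional evaluations of the accurate model are required.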

B. Hybrid Schemes 

Another useful aspect of the use of state space schemes for optimization is the natural possibility to incorporate both deterministic and stochastic methods for building hybrid optimization schemes. This can easily be done by enabling the input vector $u_k$ and building an outer feedback system as discussed in subsection III-A. The natural representation of the interaction between the deterministic drift and the stochastic diffusion is therefore of interest, as it illustrates the power of the combination. For example, if only some elements of the gradient $g$ are available because the function $\Psi$ is not continuous with respect to the remaining variables, we can use only the available gradient information for the input vector $u$. The other components of $x$ are updated by the stochastic algorithm. In addition, an outer resampling scheme retains the property of a stochastic optimization scheme to overcome local minima.

C. Robust Schemes 

Uncertainty is in many respects a topic of concern in state estimation. This is due to the fact that models often do not cover all physical aspects because of model reduction. Optimization engineers have also developed robust target functions in order to find solutions insensitive to parameter variations of $x$ [3]. Such robust objective functions are typically of the form

$$\min_x \max_\xi \Psi(x, \xi), \tag{16}$$

where $\xi$ describes an inherent uncertainty in the parameters. Most often the absolute value of $\xi$ is limited. Robust state estimation has brought up the $H_\infty$ concept [4], where the estimation error $e = x - \hat{x}$ is minimized using an approach of the form

$$\min_{\hat{x}} \max_{v,w} J(x, \hat{x}, v, w). \tag{17}$$

Here no limitations on the process noise $w$ and the measurement noise $v$ are assumed. The $H_\infty$ filter seeks the best estimate under worst case conditions. Mostly game theoretic approaches are used to formulate the function $J$. We point this out as control scientists have gained a lot of experience in the field and there might be useful aspects for optimization.
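The difference between a nominal and a robust solution in the sense of equation (16) can be illustrated with a brute force search on a grid. The scalar objective below, with a narrow deep minimum and a broad shallow one, the bound on $\xi$ and the helper name are assumptions made for this sketch.

```python
import numpy as np

def robust_minimizer(psi, x_grid, xi_grid):
    """Grid search for min over x of max over xi of psi(x, xi), cf. (16)."""
    X, XI = np.meshgrid(x_grid, xi_grid, indexing="ij")
    worst_case = psi(X, XI).max(axis=1)   # inner maximization over xi
    i = np.argmin(worst_case)             # outer minimization over x
    return x_grid[i], worst_case[i]

# Hypothetical objective: the uncertainty xi perturbs the parameter x.
# A narrow deep minimum sits at x = 1, a broad shallow one at x = -1.
def psi(x, xi):
    z = x + xi
    return -np.exp(-((z - 1.0) / 0.05) ** 2) - 0.8 * np.exp(-((z + 1.0) / 0.5) ** 2)

x_grid = np.linspace(-2.0, 2.0, 401)
xi_grid = np.linspace(-0.2, 0.2, 41)      # bounded uncertainty |xi| <= 0.2

x_nominal = x_grid[np.argmin(psi(x_grid, 0.0))]
x_robust, worst_value = robust_minimizer(psi, x_grid, xi_grid)
```

In this example the nominal minimizer lies in the narrow minimum, whereas the min-max formulation prefers the broad minimum, which is less sensitive to the parameter variation.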


V. A NUMERICAL EXAMPLE 

To provide a numerical example for the presented considerations of state space methods for optimization, we present a simple optimization problem consisting of an inverse problem for a resistor network. Figure 3(a) depicts the resistor network under investigation. The black lines illustrate resistors with a value of $R_1 = 1\,\Omega$. The gray lines mark a circular disk of radius $r$ in which resistors with a value of $R_2$ are placed. The mapping between the circle radius and the resistor values is discontinuous in that a resistor has to lie fully inside the circle. The aim is now to find the radius $r$ of the circle and the resistor value $R_2$ from some electrical boundary measurements. These measurements form the vector $y_d$. A problem of this kind is a classical inverse problem where we aim at the determination of the state vector $x = [r \;\; R_2]^T$ from measurements $y_d$.

Figure 3(b) depicts, as an example, a part of the cost function. Here $R_2$ and $r$ were set to $R_2 = 0.5\,\Omega$ and $r = 0.5$ m. The edge length of the network was set to 1 m and was discretized by 40 resistors. A current is injected at the upper left corner and 5 equidistant measurement points (ampere meters) are connected to the lower edge.

(a) Resistor network. (b) Objective function Ψ over r (m) and R2 (Ω).

Fig. 3. Test example and objective function.

We now want to apply a hybrid optimization approach where the key points of the algorithm can be stated as follows:

• We use a population based scheme.

• We use gradient information about the resistor value $R_2$.

• We work on a reduced model (only half the number of resistor elements per edge).

In state estimation such an algorithm belongs to the class of sequential Monte Carlo (SMC) methods and is mostly referred to as a particle filter (PF) [12].
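The structure of such a hybrid scheme can be sketched on a toy problem. The code below is not the resistor network solver of this paper: the objective `psi` is a hypothetical stand-in that is smooth in $R_2$ and piecewise constant in $r$, there is no reduced model, and plain multinomial resampling via `rng.choice` is used for brevity. All parameter values are assumptions.

```python
import numpy as np

def psi(r, R2):
    """Hypothetical objective: smooth in R2, piecewise constant in r."""
    r_eff = np.floor(r * 20.0) / 20.0   # r only acts in discrete steps
    return (R2 - 0.5) ** 2 + (r_eff - 0.5) ** 2

def dpsi_dR2(R2):
    """Gradient of the toy objective with respect to R2."""
    return 2.0 * (R2 - 0.5)

def hybrid_optimize(n_particles=50, n_iter=30, s=0.25, sigma_r=0.03,
                    temperature=1e-3, seed=0):
    rng = np.random.default_rng(seed)
    r = rng.uniform(0.3, 0.7, n_particles)
    R2 = rng.uniform(0.3, 0.7, n_particles)
    for _ in range(n_iter):
        # Deterministic drift (input u): gradient step on R2 only
        R2 = R2 - s * dpsi_dR2(R2)
        # Stochastic diffusion (process noise w): random walk on r
        r = np.clip(r + sigma_r * rng.standard_normal(n_particles), 0.3, 0.7)
        # Outer resampling with weights proportional to exp(-Psi / temperature)
        values = psi(r, R2)
        w = np.exp(-(values - values.min()) / temperature)
        idx = rng.choice(n_particles, size=n_particles, p=w / w.sum())
        r, R2 = r[idx], R2[idx]
    return r, R2

r_final, R2_final = hybrid_optimize()
```

The gradient step handles the continuous variable, while the random walk and the resampling handle the variable for which no derivative exists, mirroring the split between the input $u$ and the process noise $w$ discussed in subsection IV-B.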

Arguably one of the most interesting points in this list is the use of a reduced model to solve the optimization problem. Figure 4(a) depicts the objective function (6) ($W$ was set to be the identity matrix) when using the reduced model for solving the optimization problem with data from the fine model. One can observe that the depicted part of the objective function does not even include a minimum. Figure 4(b) depicts the likelihood of form (15), using an enhanced error model. As we can see, the point where the likelihood function has its maximum represents the true solution. Thus, if our optimization algorithm is designed to minimize the corresponding objective function, it should be possible to find the solution although working on the reduced model.

(a) Objective function Ψ* over r (m) and R2 (Ω). (b) Posteriori probability π*(r, R2) over r (m) and R2 (Ω).

Fig. 4. Determination of r and R2 using a reduced model.

Figure 5 depicts the behavior and the result of the proposed hybrid scheme for the given problem using the reduced model for the solution. Figure 5(a) depicts the state of the population. As can be seen, the population is clustered around the correct solution. Here the background color depicts the objective function for the fine model but, as stated, the coarse model is used. Figure 5(b) and figure 5(c) depict the decrease of the objective function and the increase of the likelihood function, respectively. The dots illustrate the spread of the population. As can be seen, both the likelihood function and the objective function can also become smaller or larger, respectively. Thus, the characteristic property of stochastic methods is retained.

(a) Particles, plotted over R2 (Ω) and r (m). (b) Objective function Ψ(r, R2) over the iteration. (c) Posteriori probability π(r, R2) over the iteration.

Fig. 5. Output of the particle filter.

VI. OUTLOOK 

In this paper we demonstrated a state space view on optimization algorithms. Both deterministic and stochastic methods were examined in the context of the unified state space representation. We demonstrated that deterministic approaches can be considered as standard feedback systems, whereas stochastic methods can be directly linked to state estimation. We explored features of stochastic state estimation using Bayes' law and subsequently demonstrated the usefulness of state estimation techniques for optimization. All our considerations are summarized in a hybrid optimization algorithm working on a reduced model, where we demonstrated the natural interaction of deterministic and stochastic methods using state space descriptions. Further research will focus on two directions. First, we will extend the presented hybrid scheme using some more sophisticated methods. Second, we consider working on a formal description of different stochastic algorithms using methods from probability theory.

REFERENCES 

[1] R. Fletcher, Practical Methods of Optimization, 2nd ed., Wiley-Interscience, New York, USA, 1987.

[2] L. dos Santos Coelho and P. Alotto, Multiobjective Electromagnetic Optimization Based on a Nondominated Sorting Genetic Approach With a Chaotic Crossover Operator, IEEE Transactions on Magnetics, vol. 44, no. 6, pp. 1078-1081, 2008.

[3] P. Alotto, C. Magele, W. Renhart, A. Weber, and G. Steiner, Robust target functions in electromagnetic design, COMPEL: The International Journal for Computation and Mathematics in Electrical and Electronic Engineering, vol. 22, no. 3, pp. 549-560, 2003.

[4] D. Simon, Optimal state estimation: Kalman, H∞ and nonlinear approaches, Wiley-Interscience, John Wiley & Sons, Inc., New Jersey, 2006.

[5] R. Storn and K. Price, Differential evolution - a simple and efficient heuristic for global optimization over continuous spaces, Journal of Global Optimization, vol. 11, pp. 341-359, 1997.

[6] L. dos Santos Coelho, C. da Costa Silveira, C. A. Sierakowski, and P. Alotto, Improved Bacterial Foraging Strategy Applied to TEAM Workshop Benchmark Problem, IEEE Transactions on Magnetics, vol. 46, no. 8, pp. 2903-2906, Aug. 2010.

[7] O. Hajji, S. Brisset, and P. Brochet, Comparing stochastic optimization methods used in electrical engineering, 2002 IEEE International Conference on Systems, Man and Cybernetics, vol. 7, 6 pp., 6-9 Oct. 2002.

[8] R. Douc, O. Cappe, and E. Moulines, Comparison of resampling schemes for particle filtering, in 4th International Symposium on Image and Signal Processing and Analysis (ISPA), pp. 64-69, 2005.

[9] M. N. Albunni, V. Rischmuller, T. Fritzsche, and B. Lohmann, Multiobjective Optimization of the Design of Nonlinear Electromagnetic Systems Using Parametric Reduced Order Models, IEEE Transactions on Magnetics, vol. 45, no. 3, pp. 1474-1477, March 2009.

[10] A. I. Forrester, A. Sóbester, and A. J. Keane, Engineering Design via Surrogate Modelling: A Practical Guide, Wiley, 2008.

[11] J. P. Kaipio and E. Somersalo, Statistical and computational inverse problems, Applied Mathematical Sciences, Springer, New York, 2004.

[12] M. S. Arulampalam, S. Maskell, N. Gordon, and T. Clapp, A tutorial on particle filters for online nonlinear/non-Gaussian Bayesian tracking, IEEE Transactions on Signal Processing, vol. 50, no. 2, pp. 174-188, 2002.


Quasi TEM Analysis of 2D Symmetrically Coupled 

Strip Lines with Finite Grounded Plane using HBEM 

*Saša S. Ilić, *Mirjana T. Perić, *Slavoljub R. Aleksić and *Nebojša B. Raičević 

*University of Niš, Faculty of Electronic Engineering of Niš, Aleksandra Medvedeva 14, 18000 Niš, Serbia 

E-mail: sasa.ilic@elfak.ni.ac.rs 

Abstract—The hybrid boundary element method (HBEM), based on combination between equivalent electrodes method 

(EEM) and boundary element method (BEM), is applied for characteristic parameters determination of symmetrically coupled 

strip lines with a finite width grounded plane. Even and odd modes are considered in the paper. All results for the characteristic 

impedance and the effective dielectric permittivity are compared with the finite element method (FEM). 

Index Terms—Characteristic impedance, Equivalent Electrodes Method (EEM), Finite Element Method (FEM), Hybrid 

Boundary Element Method (HBEM). 

I. INTRODUCTION

Over the years, many authors have analyzed symmetrically and asymmetrically coupled or ordinary strip lines with width-limited dielectric substrate using numerous numerical and analytical methods [1]-[9]. The variational method [1], Galerkin's method, the method of moments [2]-[4], the boundary element method [5], conformal mapping and the moving perfect electric wall method [6]-[9] are some of the commonly used methods. On the other hand, the problem of the width-limited microstrip grounded plane has not been investigated as often, although these forms of microstrips are typical in practice. In [7]-[9] the microstrip line with finite-width dielectric and grounded plane was analyzed. A so-called moving perfect electric wall method (MPEW) in conjunction with the conformal mapping method (CMM) was applied in those papers.

An application of the boundary element method (BEM) usually involves singular and nearly singular integrals whose evaluation is difficult, although the original problems are not singular. In order to avoid numerical integrations, it is possible to substitute small boundary segments by total charges placed at their centres. The Green's function for the electric scalar potential of the charges, placed in free space at the boundary of two dielectrics, is used, and the proposed method is called the hybrid boundary element method (HBEM) [10]-[17].

This method is a combination of the BEM and the equivalent electrodes method (EEM). The basic idea is to replace an arbitrarily shaped electrode by equivalent electrodes (EEs), and an arbitrarily shaped boundary surface between any two dielectric layers by discrete equivalent total charges per unit length placed in the air. The basic Green's function for the electric scalar potential of the charges placed in free space at the boundary surface of two dielectrics is used. The method is based on the EEM and on the point-matching method (PMM) for the potential of the perfect electric conductor (PEC) electrodes and for the normal component of the electric field at the boundary surface between any two dielectric layers.

The HBEM has so far been applied to multilayered electromagnetic problems [10], grounding systems [12], electromagnetic field determination in the vicinity of cable terminations [13], as well as to the calculation of strip line parameters when the strip line is above an infinite-width grounded plane [14]. The HBEM can also be applied to the analysis of corona effects [15] and metamaterial structures [16]. The problem of a symmetrically coupled strip line placed above an infinite grounded plane is investigated in [17].

In this paper, the HBEM is applied to calculate the even- and odd-mode characteristic impedance of 2D symmetrically coupled strip lines with a finite grounded plane, shown in Fig. 1. The quasi TEM analysis is used.

Figure 1: Symmetrically coupled strip line with finite grounded plane.

Symmetrically coupled strip lines can be used as basic elements for filters, phase shifters, directional couplers, baluns and combiners [18].

II. THEORETICAL BACKGROUND 

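The HBEM is applied and the corresponding model is formed, as shown in Fig. 2.

Figure 2: Corresponding HBEM model.

Using the existing symmetry, the electric scalar potential of the whole system from Fig. 2 is determined:

$$
\begin{aligned}
\varphi^{(e,o)} = A\varphi_0
&- \sum_{i=1}^{2}\sum_{k=1}^{K_i} \frac{q_{dik}}{2\pi\varepsilon_0}
\left[\ln\sqrt{(x-x_{dik})^2+(y-y_{dik})^2} + B\ln\sqrt{(x+x_{dik})^2+(y-y_{dik})^2}\right] \\
&- \sum_{i=3}^{4}\sum_{k=1}^{K_i} \frac{q_{aik}}{2\pi\varepsilon_0}
\left[\ln\sqrt{(x-x_{aik})^2+(y-y_{aik})^2} + B\ln\sqrt{(x+x_{aik})^2+(y-y_{aik})^2}\right] \\
&- \sum_{i=1}^{3}\sum_{m=1}^{M_i} \frac{q_{tim}}{2\pi\varepsilon_0}
\left[\ln\sqrt{(x-x_{tim})^2+(y-y_{tim})^2} + B\ln\sqrt{(x+x_{tim})^2+(y-y_{tim})^2}\right],
\end{aligned}
\tag{1}
$$

where the coefficients A and B have the following values:

$$
A = \begin{cases} 0, & \text{odd (o) mode;} \\ 1, & \text{even (e) mode,} \end{cases}
\qquad
B = \begin{cases} -1, & \text{odd (o) mode;} \\ 1, & \text{even (e) mode.} \end{cases}
$$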
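The role of the coefficient B can be illustrated with a single line charge and its mirror term with respect to the symmetry plane x = 0. The sketch below assumes the usual 2D line-charge potential, -q/(2πε0)·ln r, for each term; the function name and the numerical values are illustrative only and are not taken from the paper.

```python
import math

EPS0 = 8.854187817e-12  # vacuum permittivity, F/m


def pair_potential(x, y, q, xs, ys, B):
    """Potential at (x, y) of a line charge q at (xs, ys) together with its
    mirror term at (-xs, ys), weighted by B (+1 even mode, -1 odd mode).
    Assumption: free-space 2D line-charge potential, constant omitted."""
    r_src = math.hypot(x - xs, y - ys)
    r_img = math.hypot(x + xs, y - ys)
    return -q / (2.0 * math.pi * EPS0) * (math.log(r_src) + B * math.log(r_img))


# Illustrative values: 1 nC/m located at (0.5, 0.1).
q, xs, ys = 1e-9, 0.5, 0.1

# Odd mode: both distances coincide on x = 0, so the plane is at zero potential.
phi_odd_on_plane = pair_potential(0.0, 0.3, q, xs, ys, B=-1)

# Even mode: the potential is mirror-symmetric about x = 0.
phi_even_left = pair_potential(-0.2, 0.3, q, xs, ys, B=+1)
phi_even_right = pair_potential(0.2, 0.3, q, xs, ys, B=+1)
```

With B = -1 the plane x = 0 behaves as an electric wall, and with B = +1 the potential is an even function of x, which is why only one half of the structure needs to be discretized.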

The electric field is $\mathbf{E} = -\operatorname{grad}\varphi^{(e,o)}$. The total number of unknowns, $N_{tot}$, is:

$$
N_{tot} = \sum_{i=1}^{4} K_i + \sum_{i=1}^{3} M_i + A.
$$

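The relation between the normal component of the electric field and the total surface charges is given by Eq. (2):

$$
\hat{n}_i \cdot \mathbf{E}_{im}^{(0+)} = \frac{\varepsilon_i}{\varepsilon_0(\varepsilon_i - \varepsilon_0)}\,\eta_{t\,im},
\qquad
\eta_{t\,im} = \frac{q_{t\,im}}{\Delta l_{im}},
\tag{2}
$$

where $m = 1, \ldots, M_i$, $i = 1, 2, 3$, and $\hat{n}_i$ ($\hat{n}_1 = \hat{n}_2 = \hat{y}$, $\hat{n}_3 = \hat{x}$) are unit normal vectors oriented from the layer $\varepsilon_i$ towards the layer $\varepsilon_0$.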

Using the PMM for the potential of the perfect conductors given by (1), the PMM for the normal component of the electric field (2), and the electrical neutrality condition (3) (only for the even mode), it is possible to determine the unknown free charges per unit length on the conductors, the total charges per unit length on the boundary surfaces between two dielectric layers and the unknown constant $\varphi_0$.

The electrical neutrality condition is:

$$
\sum_{i=1}^{2}\sum_{k=1}^{K_i} q_{dik} + \sum_{i=3}^{4}\sum_{k=1}^{K_i} q_{aik} = 0.
\tag{3}
$$

After solving the system of linear equations, it is possible to calculate the capacitance per unit length of the strip line given by (4):

$$
C^{(e,o)} = \frac{1}{U}\left(\sum_{k=1}^{K_1} q_{d1k} + \sum_{k=1}^{K_3} q_{a3k}\right).
\tag{4}
$$

With the developed program code, the characteristic impedance of the symmetrically coupled strip line is calculated as

$$
Z_c^{(e,o)} = Z_{c0}^{(e,o)} \Big/ \sqrt{\varepsilon_r^{eff\,(e,o)}},
$$

where

$$
\varepsilon_r^{eff\,(e,o)} = C^{(e,o)} / C_0^{(e,o)}
$$

is the effective dielectric permittivity, and $Z_{c0}^{(e,o)}$ is the characteristic impedance of the symmetrically coupled strip line without dielectric layer (free space), for even (e) and odd (o) modes, respectively.
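As a hedged illustration of this post-processing step, the sketch below computes the effective permittivity and characteristic impedance from two per-unit-length capacitances. It assumes the standard quasi-TEM relation Z_c0 = 1/(c·C_0) for the air-filled line, which is not stated explicitly in the text; the function name and the capacitance values are hypothetical.

```python
import math

C_LIGHT = 299_792_458.0  # speed of light in vacuum, m/s


def line_parameters(c_diel, c_air):
    """Quasi-TEM parameters from capacitances per unit length (F/m).
    c_diel: capacitance with the dielectric present.
    c_air:  capacitance of the same geometry with the dielectric removed.
    Assumption: Z_c0 = 1 / (c * C_0) for the air-filled line."""
    eps_eff = c_diel / c_air
    zc0 = 1.0 / (C_LIGHT * c_air)
    zc = zc0 / math.sqrt(eps_eff)
    return eps_eff, zc0, zc


# Hypothetical capacitances, chosen only to exercise the formulas.
eps_eff, zc0, zc = line_parameters(c_diel=60e-12, c_air=30e-12)
```

The same function would be called once per mode, with the even- and odd-mode capacitances obtained from Eq. (4).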

In order to verify the obtained numerical results for the 

characteristic impedance and the effective dielectric 

permittivity, the finite element method (FEM) [19] is 

used. 

III. RESULTS 

The convergence of the results and the computation time for the even and odd modes are given in Table I, for parameters: $\varepsilon_r = 3$, $d/w_1 = 4$, $h/d = 0.5$, $t_1/w_1 = 0.1$, $s/w_1 = 1.0$, $w_2/w_1 = 6.0$ and $t_2/t_1 = 2.0$, where $N_{tot}$ is the total number of unknowns.

TABLE I
CONVERGENCE OF RESULTS AND CPU TIME

          Even mode               Odd mode
N_tot     ε_r^eff   Zc [Ω]        ε_r^eff   Zc [Ω]       t (s)
298       2.119     158.860       1.817     77.293       4.4
370       2.120     158.871       1.823     77.207       6.9
444       2.121     158.901       1.827     77.162       9.7
585       2.123     158.906       1.832     77.093       16.7
655       2.123     158.917       1.833     77.077       20.9
726       2.124     158.905       1.835     77.048       25.7
800       2.124     158.916       1.836     77.039       31.5
872       2.124     158.916       1.837     77.026       38.8
940       2.125     158.905       1.838     77.007       45.4
1014      2.125     158.914       1.839     77.003       51.4
1085      2.125     158.904       1.840     76.987       58.1
1155      2.125     158.911       1.840     76.986       65.4
1225      2.125     158.917       1.840     76.984       74.1
1296      2.126     158.909       1.841     76.973       86.5
1370      2.125     158.915       1.841     76.972       97.6

First, very good convergence of both parameters is achieved for both modes. Second, the computation time is much shorter than that required by FEM: up to 97.6 seconds were needed for the system of 1370 unknowns, while FEM took about 15 minutes to solve the same problem with a few hundred thousand finite elements.

Equipotential contours and distributions of polarized charges per unit length along the boundary surface are shown in Figs. 3-6 (even and odd modes, respectively) for parameters: $\varepsilon_r = 3$, $d/w_1 = 4$, $h/d = 0.5$, $t_1/w_1 = 0.1$, $s/w_1 = 1.0$, $w_2/w_1 = 6.0$ and $t_2/t_1 = 2.0$.

Figure 3: Equipotential contours (Even mode). 

Figure 4: Equipotential contours (Odd mode). 

Figure 5: Distribution of polarized charges per unit length along boundary surface (Even mode).

Figure 6: Distribution of polarized charges per unit length along boundary surface (Odd mode).

TABLE II
COMPARED RESULTS FOR CHARACTERISTIC IMPEDANCE OF STRIP LINE VERSUS $t_2/t_1$ AND $h/d$ FOR PARAMETERS: $\varepsilon_r = 3$, $d/w_1 = 4$, $t_1/w_1 = 0.05$, $s/w_1 = 1.0$ AND $w_2/w_1 = 6.0$.

               Even mode                                Odd mode
               HBEM               FEM                   HBEM               FEM
t2/t1   h/d    ε_r^eff  Zc [Ω]    ε_r^eff  Zc [Ω]       ε_r^eff  Zc [Ω]    ε_r^eff  Zc [Ω]
1       0.2    2.3967   84.950    2.3969   84.864       2.0470   64.218    2.0558   63.942
1       0.4    2.2182   138.000   2.2185   137.829      1.9009   77.198    1.9120   76.805
1       0.6    2.0856   182.116   2.0862   181.851      1.8642   81.055    1.8757   80.626
1       0.8    1.9849   220.428   1.9863   220.066      1.8550   82.450    1.8676   81.983
1       1.0    1.9059   254.371   1.9076   253.892      1.8533   83.020    1.8666   82.318
2       0.2    2.3938   84.935    2.3965   84.783       2.0470   64.215    2.0555   63.948
2       0.4    2.2144   137.901   2.2172   137.624      1.9008   77.192    1.9119   76.801
2       0.6    2.0817   181.901   2.0844   181.526      1.8642   81.048    1.8757   80.618
2       0.8    1.9810   220.107   1.9844   219.602      1.8543   82.457    1.8676   81.981
2       1.0    1.9021   253.950   1.9056   253.368      1.8533   83.016    1.8670   82.545
3       0.2    2.3923   84.907    2.3957   84.741       2.0469   64.213    2.0553   63.949
3       0.4    2.2122   137.786   2.2156   137.492      1.9008   77.188    1.9119   76.794
3       0.6    2.0793   181.675   2.0827   181.265      1.8641   81.043    1.8755   80.620
3       0.8    1.9785   219.777   1.9826   219.241      1.8543   82.453    1.8676   81.978
3       1.0    1.8997   253.521   1.9038   252.901      1.8532   83.013    1.8666   82.554
4       0.2    2.3912   84.878    2.3942   84.739       2.0468   64.212    2.0557   63.939
4       0.4    2.2105   137.677   2.2139   137.412      1.9007   77.184    1.9118   76.792
4       0.6    2.0775   181.465   2.0809   181.083      1.8641   81.039    1.8756   80.600
4       0.8    1.9766   219.468   1.9809   218.918      1.8543   82.449    1.8675   81.973
4       1.0    1.8978   253.120   1.9021   252.483      1.8532   83.010    1.8666   82.548


TABLE III
COMPARED RESULTS FOR CHARACTERISTIC IMPEDANCE OF STRIP LINE VERSUS $w_2/w_1$ FOR PARAMETERS: $\varepsilon_r = 3$, $d/w_1 = 4$, $t_1/w_1 = 0.05$, $h/d = 0.5$, $s/w_1 = 1.0$ AND $t_2/t_1 = 2.0$.

          Even mode                                Odd mode
          HBEM               FEM                   HBEM               FEM
w2/w1     ε_r^eff  Zc [Ω]    ε_r^eff  Zc [Ω]       ε_r^eff  Zc [Ω]    ε_r^eff  Zc [Ω]
4.5       2.2100   168.392   2.2120   168.080      1.8799   80.064    1.8915   79.655
5.0       2.1823   165.314   2.1845   165.004      1.8784   79.885    1.8899   79.479
5.5       2.1605   162.805   2.1630   162.490      1.8772   79.743    1.8888   79.344
6.0       2.1430   160.759   2.1458   160.438      1.8749   79.656    1.8876   79.242
8.0       2.0981   155.589   2.1026   155.229      1.8718   79.427    1.8858   78.982
10.0      2.0753   152.991   2.0805   152.606      1.8670   79.352    1.8848   78.898
15.0      2.0492   150.340   2.0575   149.885      1.8598   79.343    1.8598   79.343

TABLE IV
COMPARED RESULTS FOR CHARACTERISTIC IMPEDANCE OF STRIP LINE VERSUS $t_1/w_1$ FOR PARAMETERS: $\varepsilon_r = 3$, $d/w_1 = 4$, $w_2/w_1 = 6.0$, $h/d = 0.5$, $s/w_1 = 1.0$ AND $t_2/t_1 = 2.0$.

          Even mode                                Odd mode
          HBEM               FEM                   HBEM               FEM
t1/w1     ε_r^eff  Zc [Ω]    ε_r^eff  Zc [Ω]       ε_r^eff  Zc [Ω]    ε_r^eff  Zc [Ω]
0.01      2.1768   162.282   2.1632   162.257      1.9031   82.453    1.9251   81.744
0.02      2.1600   161.998   2.1583   161.804      1.8971   81.628    1.9147   81.052
0.03      2.1526   161.595   2.1537   161.316      1.8899   80.913    1.9051   80.416
0.04      2.1474   161.167   2.1496   160.857      1.8827   80.257    1.8963   79.801
0.05      2.1430   160.759   2.1458   160.438      1.8749   79.656    1.8876   79.242
0.06      2.1388   160.362   2.1422   160.021      1.8679   79.074    1.8800   78.670
0.07      2.1352   159.976   2.1388   159.641      1.8610   78.519    1.8754   78.222
0.08      2.1319   159.605   2.1356   159.271      1.8543   77.985    1.8645   77.640
0.09      2.1288   159.250   2.1264   158.700      1.8447   77.470    1.8577   77.130
0.10      2.1258   158.909   2.1293   158.592      1.8413   76.973    1.8507   76.655

To validate the accuracy of the presented method, a comparison is made with results obtained using the FEM software [19]. Those results are shown in Tables II-IV.

As presented in the tables, the numerical results for the effective dielectric permittivity and the characteristic impedance obtained using the HBEM are in very good agreement with the FEM values (computed with a few hundred thousand finite elements), with a deviation of less than 0.4% in most cases.
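The size of the deviation can be checked directly from the tabulated values. The sketch below takes the even-mode characteristic impedances from Table III (first HBEM/FEM column pair) and computes the relative deviation with respect to the FEM value; treating FEM as the reference is an assumption made here for illustration, and the function name is hypothetical.

```python
def relative_deviation_percent(hbem, fem):
    """Relative deviation of the HBEM value from the FEM value, in percent.
    Assumption: the FEM result is used as the reference."""
    return abs(hbem - fem) / abs(fem) * 100.0


# (HBEM, FEM) characteristic impedances in ohms, first column pair of Table III.
table3_zc = [
    (168.392, 168.080), (165.314, 165.004), (162.805, 162.490),
    (160.759, 160.438), (155.589, 155.229), (152.991, 152.606),
    (150.340, 149.885),
]

deviations = [relative_deviation_percent(h, f) for h, f in table3_zc]
worst = max(deviations)
```

For these seven rows the deviation grows with the width ratio but stays below the 0.4% level quoted in the text.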

Distributions of the characteristic impedance versus $s/w_1$ for different values of the dielectric permittivity $\varepsilon_r$ are shown in Figs. 7 and 8, for: $d/w_1 = 4$, $h/d = 0.5$, $t_1/w_1 = 0.1$, $w_2/w_1 = 6.0$ and $t_2/t_1 = 2.0$.

Fig. 7 shows that, for the even mode, the characteristic impedance decreases as the parameter $s/w_1$ increases. For the odd mode (Fig. 8), in contrast, the characteristic impedance increases with $s/w_1$. The lowest values of the characteristic impedance are obtained for the highest value of the dielectric permittivity.

The obtained values are also compared with the FEM results, and very good agreement is obtained.

Figure 7: Distribution of characteristic impedance versus $s/w_1$ for different values of dielectric permittivity (Even mode).

Figure 8: Distribution of characteristic impedance versus $s/w_1$ for different values of dielectric permittivity (Odd mode).

IV. CONCLUSION

A newly developed hybrid boundary element method is applied to the quasi TEM analysis of 2D symmetrically coupled strip lines with a finite grounded plane. Two quasi-static parameters are calculated: the effective dielectric permittivity and the characteristic impedance of the line. The values of these parameters have been compared with those obtained by the finite element method. Very good agreement of the results is achieved: the maximal relative error of the characteristic impedance is less than 0.4%.

All calculations were performed on a computer with a dual core INTEL processor at 2.8 GHz and 4 GB of RAM.

This method can be successfully applied to static, stationary and quasi-stationary electromagnetic fields, as well as to the analysis of fields in mechanics, fluid dynamics, conductive heat flow etc.

Acknowledgement 

This research was partially supported by funding from 

the Serbian Ministry of Education and Science in the 

frame of the project TR 33008. 

REFERENCES 

[1] T. Fukuda, T. Sugie, K. Wakino, Y.-D. Lin, and T. Kitazawa, 

“Variational method of coupled strip lines with an inclined 

dielectric substrate,” in Asia Pacific Microwave Conference – 

APMC 2009, December 7-10, 2009, pp. 866-869. 

[2] R. F. Harrington, Field computation by Moment Methods. New 

York: Macmillan, 1968. 

[3] T. G. Bryant and J. A. Weiss, “Parameters of microstrip 
transmission lines and of coupled pairs of microstrip lines,” IEEE 
Trans. Microwave Theory Tech., vol. MTT-16, pp. 1021-1027, 
Dec. 1968. 

[4] A. Farrar and A. T. Adams, “Characteristic impedance of 
microstrip by the method of moments,” IEEE Trans. Microwave 
Theory Tech., vol. MTT-18, pp. 65-66, Jan. 1970. 

[5] K. Li, and Y. Fujii, “Indirect boundary element method applied 
to generalized microstrip line analysis with applications to side-proximity 
effect in MMICs,” IEEE Trans. Microwave Theory and 
Techniques, vol. 40, pp. 237–244, Feb. 1992. 

[6] C.E. Smith, and R.S. Chang, “Microstrip transmission line with 

finite width dielectric,” IEEE Trans. Microwave Theory and 

Techniques, vol. 28, pp. 90–94, Feb. 1980. 

[7] J. Svacina, “Analytical models of width-limited microstrip lines,” 

Microwave and Optical Technology Letters, vol. 36, pp. 63–65, 

Jan. 2003. 

[8] J. Svacina, “New method for analysis of microstrip with finite-width 
ground plane,” Microwave and Optical Technology Letters, 
vol. 48, no. 2, pp. 396-399, Feb. 2006. 

[9] C.E. Smith, and R.S. Chang, “Microstrip transmission line with 

finite width dielectric and ground plane,” IEEE Trans. Microwave 

Theory and Techniques, vol. 33, pp. 835–839, Sept. 1985. 

[10] N. B. Raičević, S. R. Aleksić and S. S. Ilić, “A hybrid boundary 

element method for multilayer electrostatic and magnetostatic 

problems,” J. Electromagnetics, No. 30, pp. 507-524, 2010. 

[11] N. B. Raičević, S. R. Aleksić, “One method for electric field determination 

in the vicinity of infinitely thin electrode shells,” 

Journal Engineering Analysis with Boundary Elements, Elsevier, 

No. 34, pp. 97-104, 2010. 

[12] S. S. Ilić, N. B. Raičević, and S. R. Aleksić, “Application of new 

hybrid boundary element method on grounding systems,” in 14th 

International IGTE'10 Symp., Graz, Austria, Sept. 19-22, 2010, 

pp. 160-165. 

[13] N. B. Raičević, S. S. Ilić, and S. R. Aleksić, “Application of new 

hybrid boundary element method on the cable terminations,” in 

14th International IGTE'10 Symp., Graz, Austria, Sept. 19-22, 

2010, pp. 56-61. 

[14] S. S. Ilić, S. R. Aleksić, and N. B. Raičević, “TEM analysis of 
strip line with finite width of dielectric substrate by using new 
hybrid boundary element method,” in 10th International Conf. 
on Applied Electromagnetics ПЕС 2011, Niš, Serbia, September 
25-29, 2011, CD Proc. О8-4. 

[15] B. Petković, S. Ilić, S. Aleksić, N. Raičević, and D. Antić, “A 

novel approach to the positive DC nonlinear corona design,” J. 

Electromagnetics, vol. 31, no. 7, pp. 505-524, Oct. 2011. 

[16] N. B. Raicevic, and S. S. Ilic, “One hybrid method application on 

complex media strip lines determination,” in 3rd International 

Congress on Advanced Electromagnetic Materials in Microwaves 

and Optics, METAMATERIALS 2009, London, United Kingdom, 

2009, pp. 698-700. 

[17] S. S. Ilić, M. T. Perić, S. R. Aleksić, and N. B. Raičević, “Quasi 

TEM analysis of 2D symmetrically coupled strip lines with 

infinite grounded plane using HBEM,” in Proc. XVII-th 

International Symposium on Electrical Apparatus and 

Technologies SIELA 2012, Bourgas, Bulgaria, 28–30 May, 2012, 

pp.147-155. 

[18] A. M. Abbosh, “Analytical closed-form solutions for different 

configurations of parallel-coupled microstrip lines”, in IET 

Microwaves, Antennas & Propagation, Vol. 3, Iss. 1, pp. 137- 

147, 2009. 

[19] D. Meeker, FEMM 4.2, Available: 

http://www.femm.info/wiki/Download


Design Approach for a Line-Start Internal Permanent 

Magnet Synchronous Motor 

1,2 V. Elistratova, 1 M. Hecquet, 1 P. Brochet, 2 D. Vizireanu and 2 M. Dessoude 

1 L2EP, Ecole Centrale de Lille, Cité Scientifique - BP 48 - 59651 Villeneuve d'Ascq, France 

2 EDF R&D, 1 avenue du Général de Gaulle, 92141 Clamart Cedex, France 

E-mail: vera.elistratova@ec-lille.fr 

Abstract—The work described in this paper deals with the analytical design and optimization of a line-start permanent 

magnet synchronous motor (LSPM) with radial magnet configuration. The design approach considers a LSPM as an 

induction motor (IM) combined with a permanent magnet rotor arrangement and takes into account the characteristics of 

both asynchronous and synchronous regimes and the motor thermal behavior. 

Index Terms — LSPM, Eco-design, Optimization, Multi-physical model. 


I. INTRODUCTION

A large share of the primary energy resources is converted into electric energy. As the main portion of greenhouse gases is produced by fossil fuels, electricity generation is responsible for worldwide air pollution and global warming [1].

Electric motors are one of the main consumers of electricity (Fig. 1), accounting for up to 70% of the consumption of industrial processes in Europe [2]. Consequently, electric motors are responsible for a large share of CO2 emissions. Moreover, the total potential for improving the energy efficiency of applications using electric motors is in the range of 20-30%. The main factors in such improvements are the use of variable speed drives and the use of energy-efficient motors. Therefore, optimizing electric motors for better efficiency is essential for energy saving and the reduction of CO2 emissions.

Figure 1: Distribution of the industrial electric consumption [2] 

Low- and medium-power induction motors (IMs) are still widely used in many industrial applications, such as pumps and fans. In spite of their low cost, IMs normally suffer from relatively poor operational efficiency and power factor [3]. Although a permanent magnet synchronous machine (PMSM) can achieve high operational efficiency and power factor, it lacks the starting capability of the IM. Over the last few decades the line-start permanent magnet synchronous motor (LSPM) has been designed, constructed and tested. Compared with an IM, an LSPM has many advantages: synchronous speed, higher power factor and efficiency, small size, etc. In addition, it is able to start when connected directly to the mains.

II. PERFORMANCE DESIGN 

The objective of this research is to find an analytical 

model for the LSPM with different magnet 

configurations. In the present paper the LSPM with radial 

magnet arrangement (Fig.2) is designed. This 

configuration has a number of advantages: simple and 

robust structure, better protection of buried permanent 

magnets (PMs) from demagnetization. Moreover, in 

comparison with the other topologies this one has one of 

the best asynchronous loading capabilities [4, 10]. 

Figure 2: LSPM architecture under study 

An analytical model of a PMSM with the same rotor 

architecture may be found in [6, 11]. In our case this 

model can be applied under the assumption that during 

the steady-state regime a LSPM operates as a PMSM. 

In general, the structure of a LSPM is similar to an IM 

but the rotor includes both cage and inserted permanent 

magnets. Hence, the LSPM combines IMs and PMSM 

structure features: the LSPM will start due to the resultant 

of two torque components i.e. the asynchronous torque 

and magnet opponent torque (braking torque). If for the 

entire speed range during starting the asynchronous 

torque is higher than the sum of the braking and load 

torque, the motor will reach the synchronous regime [4]. 

Therefore, the design of a LSPM has to take into account 

both types of performances: starting capacity and 

efficiency in steady state regime. 

To simplify the LSPM design process, the proposed 

procedure applied in this article treats separately the 

running modes: the motor can be considered as an IM 

during its start and synchronization and as a PMSM 

during the steady-state regime. Figure 3 shows the 

workflow diagram applied for the design approach.

Figure 3: Diagram summarizing the design procedure 

At the first stage of the design we enter the data concerning the power and the stator parameters. At this stage the same design methodology as for an IM can be applied. For economic reasons, the stator of the LSPM is identical to that of an IM of the same power.

Stage 2 is to choose a squirrel cage, which gives the value of the rotor cage resistance, the level of saturation in a rotor tooth and the number of rotor bars.

At the next stage a configuration of the permanent magnets has to be chosen. Once the geometry of the rotor and the PMs is known, the d- and q-axis reactances Xd and Xq and the no-load EMF can be computed. Using these parameters, the start of the LSPM can be simulated, taking into account the braking torque caused by the PMs. If the motor is not able to start, the designer has to go back either to Stage 2, in order to change the rotor squirrel cage and improve the starting torque, or to Stage 3, to change the configuration of the PMs and reduce the braking torque.

Finally, after the successful start of the LSPM (Stage 5), the performance and steady-state characteristics are calculated. If they are acceptable, the design solution is considered as a candidate for the optimization procedure (see Sections III and IV). Otherwise, the design procedure is repeated starting from Stage 2.

Each stage of the diagram is detailed below.

A. Asynchronous Torque 

The electromagnetic design of an induction motor is a well-known problem. In this paper the design of the asynchronous part is based on the methodology proposed in [8].

The classical expression for the asynchronous torque can be written as follows:

$$
T_c = \frac{3 p V^2 \dfrac{R_2'}{s}}{2\pi f\left[\left(R_s + c\,\dfrac{R_2'}{s}\right)^2 + \left(X_{1\sigma} + c\,X_{2\sigma}'\right)^2\right]},
\tag{1}
$$

where V is the phase RMS voltage, p is the number of pole pairs, $c = 1 + X_{1\sigma}/X_m$, $X_{1\sigma}$ and $X_{2\sigma}'$ are the stator and rotor leakage reactances, and $X_m$ is the magnetizing reactance.
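A minimal sketch of how this torque-slip characteristic could be evaluated is given below. It assumes the classical equivalent-circuit form with p counted as pole pairs, stator resistance Rs and rotor resistance R2 referred to the stator; all machine parameters are hypothetical values chosen only to exercise the formula, not data from the paper.

```python
import math


def asynchronous_torque(s, V, f, p, Rs, R2, X1, X2, Xm):
    """Cage (asynchronous) torque at slip s, classical equivalent-circuit form.
    Assumptions: p = pole pairs, R2 and X2 referred to the stator,
    c = 1 + X1 / Xm as in the text."""
    c = 1.0 + X1 / Xm
    r2_s = R2 / s
    bracket = (Rs + c * r2_s) ** 2 + (X1 + c * X2) ** 2
    return 3.0 * p * V ** 2 * r2_s / (2.0 * math.pi * f * bracket)


# Hypothetical parameters of a small 4-pole, 50 Hz machine (ohms, volts).
params = dict(V=230.0, f=50.0, p=2, Rs=0.7, R2=0.6, X1=1.2, X2=1.5, Xm=40.0)

slips = [k / 100.0 for k in range(1, 101)]          # 0.01 ... 1.00
torques = [asynchronous_torque(s, **params) for s in slips]
starting_torque = torques[-1]                        # slip = 1 (standstill)
pull_out_torque = max(torques)
```

Sweeping the slip in this way gives the curve that must stay above the sum of the braking and load torques over the whole speed range for the motor to synchronize.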

B. Braking torque 

The braking torque is found as a function of the back-EMF and the stator resistance [5]:

$$
T_{br} = \frac{3 p E^2 R_s}{\omega_s}\cdot\frac{(1-s)\left(R_s^2 + X_{sq}^2(1-s)^2\right)}{\left(R_s^2 + X_{sd}X_{sq}(1-s)^2\right)^2},
\tag{2}
$$

where $R_s$ is the stator resistance, $\omega_s$ is the synchronous electrical speed, E is the RMS value of the back-EMF, s is the slip, and $X_{sd}$, $X_{sq}$ are the direct and quadrature synchronous reactances, respectively.

The braking torque reaches its maximum at low speed and declines towards synchronous speed.
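The shape of the braking torque over the speed range can be checked numerically. The sketch below assumes the short-circuited-generator form of the braking torque, with E the back-EMF at synchronous speed and p the number of pole pairs; the parameter values are hypothetical and serve only to show where the peak occurs.

```python
import math


def braking_torque(s, E, p, omega_s, Rs, Xsd, Xsq):
    """Magnet braking torque at slip s.
    Assumptions: E is the RMS back-EMF at synchronous speed, p = pole pairs,
    omega_s = electrical synchronous angular speed (rad/s)."""
    k = 1.0 - s                                   # per-unit speed
    num = k * (Rs ** 2 + (Xsq * k) ** 2)
    den = (Rs ** 2 + Xsd * Xsq * k ** 2) ** 2
    return 3.0 * p * E ** 2 * Rs / omega_s * num / den


# Hypothetical machine data.
data = dict(E=200.0, p=2, omega_s=2.0 * math.pi * 50.0, Rs=0.7, Xsd=10.0, Xsq=25.0)

speeds = [k / 200.0 for k in range(201)]          # per-unit speed 0 ... 1
curve = [braking_torque(1.0 - n, **data) for n in speeds]
peak_speed = speeds[curve.index(max(curve))]
torque_at_sync = curve[-1]
```

With these numbers the peak lies at a small fraction of synchronous speed, in line with the qualitative statement above.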

C. Steady state regime 

During the steady-state regime the rotor of the LSPM rotates at synchronous speed and its cage has no influence. In this state the performance of the machine can be calculated as for a PMSM.

As stated in [10], the RMS armature current is a function of the motor equivalent electrical parameters:

$$
I_a = \sqrt{I_{ad}^2 + I_{aq}^2},
\tag{3}
$$

where the axis currents are

$$
I_{ad} = \frac{V\left(X_{sq}\cos\delta - R_s\sin\delta\right) - E X_{sq}}{X_{sq}X_{sd} + R_s^2},
\tag{4}
$$

$$
I_{aq} = \frac{V\left(R_s\cos\delta + X_{sd}\sin\delta\right) - E R_s}{X_{sq}X_{sd} + R_s^2},
\tag{5}
$$

where δ is the load angle.

The input power of the motor is

$$
P_{in} = 3\left[I_{aq}E + I_{ad}I_{aq}\left(X_{sd} - X_{sq}\right) + R_s I_a^2\right].
\tag{6}
$$

Neglecting the stator core losses, the electromagnetic power is

$$
P_{elm} = 3\left[I_{aq}E + I_{ad}I_{aq}\left(X_{sd} - X_{sq}\right)\right].
\tag{7}
$$

The electromagnetic torque developed by a PMSM is

$$
T_{elm} = \frac{P_{elm}}{2\pi n_s},
\tag{8}
$$

where $n_s$ is the synchronous speed of the rotating magnetic field.

Taking into consideration the Joule losses $P_j$, the mechanical losses $P_m$ and the stator core losses $P_s$, the efficiency can be expressed as

$$
\eta = \frac{P_{in} - P_j - P_s - P_m}{P_{in}}.
\tag{9}
$$
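The steady-state relations can be strung together as in the sketch below, which evaluates the axis currents, the powers and the torque for one load angle. The forms used are the standard salient-pole PMSM expressions with n_s in revolutions per second; the function name and all numerical values are hypothetical.

```python
import math


def steady_state(V, E, delta, Rs, Xsd, Xsq, n_s):
    """Steady-state quantities of a salient-pole PM machine.
    Assumptions: V, E are phase RMS values, delta in radians,
    n_s in rev/s, core losses neglected in P_elm."""
    den = Xsd * Xsq + Rs ** 2
    I_ad = (V * (Xsq * math.cos(delta) - Rs * math.sin(delta)) - E * Xsq) / den
    I_aq = (V * (Rs * math.cos(delta) + Xsd * math.sin(delta)) - E * Rs) / den
    I_a = math.hypot(I_ad, I_aq)
    P_elm = 3.0 * (I_aq * E + I_ad * I_aq * (Xsd - Xsq))
    P_in = P_elm + 3.0 * Rs * I_a ** 2
    T_elm = P_elm / (2.0 * math.pi * n_s)
    return {"I_ad": I_ad, "I_aq": I_aq, "I_a": I_a,
            "P_in": P_in, "P_elm": P_elm, "T_elm": T_elm}


def efficiency(P_in, P_j, P_s, P_m):
    """Efficiency from input power and the Joule, core and mechanical losses."""
    return (P_in - P_j - P_s - P_m) / P_in


# Hypothetical operating point: 1500 rpm corresponds to n_s = 25 rev/s.
V, delta = 230.0, 0.5
res = steady_state(V=V, E=200.0, delta=delta, Rs=0.7, Xsd=10.0, Xsq=25.0, n_s=25.0)

# Independent check: input power from the terminal quantities, 3*V*I*cos(phi).
P_in_terminal = 3.0 * V * (res["I_aq"] * math.cos(delta) - res["I_ad"] * math.sin(delta))
```

The terminal-power check follows from the phasor diagram on which the axis-current expressions are based, so agreement between the two values is a quick sanity test of an implementation.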

D. Calculation of the back-EMF and the direct and 

quadrature synchronous reactances 

The analytical model of the d-q machine parameters is of interest because it allows fast evaluation of the LSPM performance in the steady-state regime, yields all the characteristics and can be integrated into the optimization procedure. For example, it makes it possible to find the optimal volume of magnets in terms of improved efficiency and reduced braking torque.

To compute the d- and q-axis reactances Xd and Xq and the back-EMF, finite element simulation or experimental testing is usually used [3-5, 9, 12]. However, there are a number of papers where all these parameters are expressed analytically as functions of the studied machine geometry [6, 10, 11].

According to the model in [6] the flat-topped value of 

the flux density in the air gap is 

B 

ag 

2Br 

emehhmp 

(4eagehhmmp2eagempRh , 

e e R 2 e e pR e e R ) 

ag m h m h r m h r 

(10) 

where em is the magnet width, hm is the magnet height, eag 

is the air gap length, eh is the hub thickness, Rh is the 

external hub radius (in our case as the shaft is made of the 

non-magnetic material eh = Rh), Rr is the rotor radius, μ0 

is the permeability of vacuum, μm is the magnet relative 

permeability, Br is the remanent magnetization, 

α=em/(2∙Rh), β=/(2·Rr) are geometrical coefficients. 

The first harmonic of the flux density in the air gap is

$$
B_{ag1} = \frac{4}{\pi}\,B_{ag}\sin\left(\frac{\alpha_i\pi}{2}\right),
\tag{11}
$$

where $\alpha_i = 2/\pi$ is the ratio of the average-to-maximum value of the normal component of the air gap magnetic flux density.

The first harmonic of the back EMF can be expressed as

32RbLact f NskwBremehhmsin( p) 

E 

, 

pe ( (4 eh pe( 2 p) R) ehR) 

ag h m m m h h m r 

(12) 

where Lact is the rotor active length, Ns is the number of 

turns in series per phase, kw is the global winding 

coefficient. 

The direct and quadrature synchronous inductances 

could be found as a function of the machine geometry 

and winding arrangement: 


2 

 

4sin( p 

) 

2p 

p 

 

 

 

Ld 

 

 

2 eagempRhemeagRh) 

 

 

 

2 2 

6Lact0kwNsRb 

Lq 

2p sin(2 p) 

2 2 

eag p 

 

 

2(6eag Rb2(4 eag Rb)cos( 

) 

 

 

(2 eag Rb)(cos(2 

) sin(2 ))) 

 

. 

2 

p(2 eag Rb) 

eag 

 

 

 

2 2 

6Lact0kNR w s b sin(2 p) 8eeR 

m h r sin( p) 

, 

2 2 

eag p p(4eagehhmpehemRr (13) 

(14) 

III. OPTIMIZATION PROBLEM 

The goal of optimization process consists in finding the 

set of optimal configurations R * taking into account 

parameters and constraints imposed by the design 

specification. Table I presents the specification for the 

studied LSPM. In the presented study the dependency 

between the efficiency and the magnet braking torque is 

analyzed. 

Table I. Specification of the designed LSPM machine

Parameter                                     Value/Feasible interval
Power, [kW]                                   7.5
Voltage LL, [V]                               400
Supply frequency f, [Hz]                      50
Rated speed, [rpm]                            1500
Height of the shaft axis, [mm]                132
Rated torque, [Nm]                            47.75
Overload conditions, Tmax/Trated              ≥1.6
Ambient temperature, Tamb [°C]                [-10; 40]
Stator winding temperature rise average, [K]  80
Stator winding hot spot, [K]                  90
Load torque, [p.u.]                           Linearly from zero to nominal speed, starting from 0.8 p.u. to 1 p.u.
Power factor, cosφ                            ≥0.8
Efficiency η, [%]                             To be maximized
Geometry of permanent magnets                 Radial magnet configuration

The design vector X = [x1, x2, …, xn]^T identifies the set of design variables. The design variables can be freely varied by the designer to define a designed object [7]. The permanent magnet geometry is analytically predetermined from the imposed specification (Table I). Consequently, the design vector of the studied problem is composed of 3 variables: x1, the length of the air gap eag; x2, the magnet height hm; x3, the magnet width em. According to equations (12)-(14) these 3 parameters are sufficient to compute the d- and q-axis reactances Xd and Xq and the back-EMF. Due to manufacturing constraints all of the components of the design vector X are discrete and standardized.

Formally, the problem is expressed as follows:

$$
\min_{X}\ \{1 - \eta,\ T_{br}\}
\quad \text{subject to} \quad
G(X) = \{g_1(X), g_2(X), \ldots, g_n(X)\} \le 0,\quad n = 2.
\tag{15}
$$

Electromagnetic constraints of the problem G(X) are specified in Table II.

Table II. Constraints of the optimization problem

Function                   Constraint level
Power factor cosφ, p.u.    ≥0.8
Tmax/Trated                ≥1.6

Here {Tmax/Trated, cosφ} are the feasible domains for the maximum torque ratio for synchronous operation and the power factor.

Taking into consideration the fact that all the 

components are discrete, in order to find R* a lot of 

configurations have to be investigated. 

IV. OPTIMIZATION TECHNIQUE 

The optimization method applied to the considered problem (15) is exhaustive enumeration (EE) [7, 13]. It is an exact method that evaluates all possible combinations of the PM dimensions and air gap length. The method does not use any heuristic rules.

Because of the presence of several objective functions, the aim of multi-objective evolutionary algorithms is to find compromise solutions rather than a single optimal point as in scalar optimization problems [14].

These tradeoff solutions are usually called Pareto optimal solutions. The EE was applied in order to obtain a genuine Pareto front. The method is not claimed to be the best one in terms of total calculation time, but it gives reliable results.

Input parameters were:

Design vector:

X = [x1, x2, …, xn]^T, in our case n = 3.

Objective functions:

F(X) = {f1(X), f2(X), …, fm(X)}, in our case m = 2 and F(X) = {(1 − η), Tbr}.

Constraints:

G(X) = {g1(X), g2(X), …, gk(X)}, in our case k = 2.

The feasible set Ω = {ω1, ω2, …, ωn}, where ωi is the subset which contains all feasible values for the component xi of the design vector, for i = 1…n. In our case n = 3. As all of the components of the design vector X are discrete, Ω is a finite set that is composed of the possible standardized values.

Output parameters:

The set of optimal solutions:

R* = {Xi* ∈ X^0 | G(Xi*) ≤ 0, for i = 1, …, m}, where Xi* is the degenerate interval, and each component of X* is a Pareto optimal solution. Therefore Xi* has the following features:

$$f_l(X^*) \le f_l(X_i)\quad \text{for } l = 1 \dots m,$$

$$f_j(X^*) < f_j(X_i)\quad \text{for at least one index } j.$$
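The enumeration and the Pareto conditions above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the discrete sets in `omega`, the two objective functions and the two constraint functions are hypothetical toy stand-ins for the analytical LSPM model (1-14), chosen only so that the result can be checked by hand.

```python
from itertools import product


def pareto_front(points):
    """Return the non-dominated points; every objective is minimized."""
    front = []
    for p in points:
        dominated = any(
            all(ql <= pl for ql, pl in zip(q["f"], p["f"]))
            and any(ql < pl for ql, pl in zip(q["f"], p["f"]))
            for q in points
        )
        if not dominated:
            front.append(p)
    return front


def exhaustive_enumeration(omega, objectives, constraints):
    """Evaluate every combination in omega and keep those with g(X) <= 0."""
    feasible = []
    for x in product(*omega):
        if all(g(x) <= 0.0 for g in constraints):
            feasible.append({"x": x, "f": tuple(f(x) for f in objectives)})
    return feasible, pareto_front(feasible)


# Hypothetical discrete design space (three components, as n = 3 in the text).
omega = [(1, 2, 3), (1, 2), (1, 2)]
# Hypothetical stand-ins for the two objectives (1 - eta) and T_br.
objectives = [
    lambda x: x[0] + x[1],
    lambda x: 6 - x[0] + x[2],
]
# Hypothetical stand-ins for the two constraints, written as g(X) <= 0.
constraints = [
    lambda x: x[1] + x[2] - 3,
    lambda x: x[0] - 2.5,
]

feasible, front = exhaustive_enumeration(omega, objectives, constraints)
for point in front:
    print(point["x"], point["f"])
```

In this toy case 6 of the 12 combinations are feasible and two of them are non-dominated. The quadratic cost of the dominance check is acceptable for a few hundred combinations, which is the scale reported in the text.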

Problem (15) was treated and a total of 360 combinations were enumerated. Of these 360 combinations, 160 belong to the feasible domain defined by the optimization constraints. Fig. 4 presents the feasible set of solutions found by EE together with the Pareto frontier.

Figure 4: Pareto front of efficiency versus braking torque 

V. DESIGN RESULTS 

A boundary point of the maximal efficiency from the 

Pareto frontier has been chosen for a deeper investigation. 

Based on this optimal solution and solving the set of 

equations (1-14) all the characteristics for steady-state 

regime and optimal dimensions of permanent magnets 

and air gap length have been found (Table III). 

Table III. Optimal solution of the designed LSPM

Parameter                            Value
Efficiency η, %                      91.2
Braking torque, Nm                   11.66
Power factor cosφ, p.u.              0.983
Air gap length eag, mm               0.7
Magnet height hm, mm                 29.0
Magnet width em, mm                  15.0
Overload condition, Tmax/Trated      1.783

Table III shows that the efficiency of the designed LSPM is more than 0.8% higher than that of a premium efficiency class induction motor (PEIM) [20]. The power factor of the PEIM (0.9 p.u.) is much lower than that of the designed LSPM (0.983 p.u.). This means the LSPM can achieve a very high power factor in a wide output power range, which helps to save energy when the motor runs at different loads.

Based on the designed data the static and dynamic 

characteristics were obtained. Figures 5-7 show that the 

designed LSPM is able to start and synchronize even at 

85% of the rated voltage.

Figure 5: Torque versus speed curve of the studied motor 

with supplied phase voltage equal 231V 

Figure 6: Torque versus speed curve of the studied motor 

with supplied phase voltage equal 85% * 231V 

Figure 7: Motor speed during transient start 

supplied with different voltages 

VI. THERMAL MODEL 

An increase in motor temperature can degrade the stator winding insulation and the performance of the permanent magnet material. According to the design specification (Table I), the temperature rise in the LSPM must not exceed 90 K. To predict the transient thermal behavior of the motor, an analytical model based on the general cylindrical component [21] was developed (Fig. 8).

Figure 8: A simplified model of LSPM as the heating body 


This model corresponds to the system of equations:

$$\frac{d\Delta\theta_{cu}}{dt} = \frac{P_{cu} - A_{12}\,(\Delta\theta_{cu}-\Delta\theta_{st}) - A_{1}\,\Delta\theta_{cu}}{C_1},$$

$$\frac{d\Delta\theta_{st}}{dt} = \frac{P_{st} - A_{2}\,\Delta\theta_{st} + A_{12}\,(\Delta\theta_{cu}-\Delta\theta_{st})}{C_2}. \tag{16}$$

where Δθcu and Δθst are the average temperature rises of the copper and of the stator laminations, respectively; A1, A2, A12 are the heat transfer coefficients; and C1 and C2 represent the heat capacities of the stator winding and the stator core, respectively. In order to determine the temperature, a simplified equivalent thermal network (ETN) model of the LSPM is considered (Fig. 9).

Figure 9: Simplified equivalent thermal network of LSPM 

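$$\begin{aligned}
(\Lambda_{cu,a}+\Lambda_{cu,c})\,\Delta\theta_{cu} - \Lambda_{cu,a}\,\Delta\theta_{a,in} - \Lambda_{cu,c}\,\Delta\theta_{s,st} &= P_{cu},\\
(\Lambda_{cu,c}+\Lambda_{rot,c}+\Lambda_{c,f})\,\Delta\theta_{s,st} - \Lambda_{cu,c}\,\Delta\theta_{cu} - \Lambda_{rot,c}\,\Delta\theta_{rot} - \Lambda_{c,f}\,\Delta\theta_{f} &= P_{s,st},\\
(\Lambda_{rot,a}+\Lambda_{rot,c})\,\Delta\theta_{rot} - \Lambda_{rot,a}\,\Delta\theta_{a,in} - \Lambda_{rot,c}\,\Delta\theta_{s,st} &= P_{rot},\\
(\Lambda_{cu,a}+\Lambda_{rot,a}+\Lambda_{a,f})\,\Delta\theta_{a,in} - \Lambda_{cu,a}\,\Delta\theta_{cu} - \Lambda_{rot,a}\,\Delta\theta_{rot} - \Lambda_{a,f}\,\Delta\theta_{f} &= P_{a,in},\\
(\Lambda_{c,f}+\Lambda_{a,f}+\Lambda_{f})\,\Delta\theta_{f} - \Lambda_{c,f}\,\Delta\theta_{s,st} - \Lambda_{a,f}\,\Delta\theta_{a,in} &= 0.
\end{aligned} \tag{17}$$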

where Δθcu is the temperature rise of the copper winding; Δθs,st, of the steel stator pack; Δθrot, of the rotor; Δθa,in, of the air gap; Δθf, of the motor case; Pcu is the source of losses in the copper winding; Ps,st, the source of losses in the stator pack; Prot, the source of losses in the rotor; Pa,in, the source of mechanical and additional losses; Λcu,c is the thermal conductance between the slot winding and the stator core; Λcu,a, between the winding and the air gap; Λrot,a, between the rotor and the air gap; Λrot,c, between the rotor and the stator core; Λa,f, between the air inside the motor and the motor case; Λc,f, between the stator core and the motor frame; and Λf, between the motor frame and the external air.
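As an illustration of how a steady-state network of this kind can be solved, the sketch below assembles a five-node conductance matrix and solves it by Gaussian elimination. It assumes the usual nodal form in which each diagonal entry is the sum of the conductances attached to that node. All conductance and loss values are hypothetical placeholders, not data from the designed motor.

```python
def solve_linear(a, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


# Hypothetical thermal conductances in W/K (placeholders only).
cu_c, cu_a, rot_a, rot_c, a_f, c_f, f = 20.0, 2.0, 3.0, 1.0, 5.0, 40.0, 15.0
# Hypothetical loss sources in W (placeholders only).
p_cu, p_s_st, p_rot, p_a_in = 300.0, 150.0, 50.0, 40.0

# Node order: copper, stator pack, rotor, inner air, frame.
matrix = [
    [cu_a + cu_c, -cu_c, 0.0, -cu_a, 0.0],
    [-cu_c, cu_c + rot_c + c_f, -rot_c, 0.0, -c_f],
    [0.0, -rot_c, rot_a + rot_c, -rot_a, 0.0],
    [-cu_a, 0.0, -rot_a, cu_a + rot_a + a_f, -a_f],
    [0.0, -c_f, 0.0, -a_f, c_f + a_f + f],
]
losses = [p_cu, p_s_st, p_rot, p_a_in, 0.0]

dtheta = solve_linear(matrix, losses)
for name, value in zip(("cu", "s,st", "rot", "a,in", "f"), dtheta):
    print(f"temperature rise {name}: {value:.2f} K")
```

A useful sanity check for any such network is the global heat balance: adding all five equations shows that the frame conductance times the frame temperature rise must equal the sum of all losses.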

The solution of systems (16) and (17) enabled us to model the temperature rise in the main parts of the designed motor (Figs. 10, 11).

Figure 10: The increase of winding temperature

Figure 11: The increase of rotor core temperature

According to Figs. 10 and 11, the maximal temperature rise is 79.4°C in the winding and about 26.3°C in the rotor, which complies with the specification requirements (Table I).

VII. CONCLUSION 

A design method for an LSPM motor that considers both the asynchronous starting capability and the synchronous steady-state performance is proposed in order to find an optimal design solution for the given motor topology. The approach is based on the design of an asynchronous machine incorporating the effect of magnets. The present analytical model takes into account the radial magnet topology and is to be extended to other LSPM architectures.

It has been shown that the efficiency and the power factor of the designed LSPM are greater than those of a PEIM of the same power.

Thereafter, an analytical thermal model was developed. The proposed thermal model allows the temperature rise in the main parts of the motor to be predicted. During the design process the thermal model did not take part in the optimization procedure.

In future investigations it might be possible to combine the electromagnetic and thermal optimization problems in a single optimization procedure. The analytical approach will be verified by both finite element and experimental models.


REFERENCES 

[1] Key world energy statistics. International Energy Agency, 2010. 

[2] La rentabilité énergétique les entrainements, Mesures 803, Mars 

2008, www.mesures.com. 

[3] Jian Li and Jungtae Song and Yunhyun Cho. A High-Performance 

Line-Start Permanent Magnet Synchronous Motor Amended From 

a Small Industrial Three-Phase Induction Motor. In Industrial 

Electronics, 2010 IEEE International Symposium, pp. 1308 -1313. 

[4] T. Ruan, H. Pan, Y. Xia « Design and Analysis of Two Different 

Line-Start PM Synchronous Motors», Artificial Intelligence, 

Management Science and Electronic Commerce (AIMSEC), 2011. 

[5] Soulard, J.; Nee, H.-P., "Study of the synchronization of line-start permanent magnet synchronous motors," Industry Applications Conference, 2000. Conference Record of the 2000 IEEE, vol.1, pp.424-431, 2000.

[6] X. Jannot, J.-C. Vannier, J. Saint-Michel and M. Gabsi, An 

Analytical Model for Interior Permanent-Magnet Synchronous 

Machine with Circumferential Magnetization Design, IEEE, 

10.1109/ELECTROMOTION.2009.5259155, July 2009. 

[7] P.Venkataraman, Applied Optimization with 

Matlab Programming, A Wiley - Interscience publication, John 

Wiley & Sons, New York, 2001. 

[8] I.P. Kopylov, Electric Machines: M., Energoatomizdat, 1986. 

[9] K. Kurihara, M. Azizur Rahman, High Efficiency Line-Start 

Interior Permanent Magnet Synchronous Motors, IEEE Trans. 

Industry Applications, Vol. 40 Issue 3, May 2004. 

[10] J.F.Gieras, M. Wing, Permanent Magnet Motor Technology, 

USA, Marcel Dekker, 2002. 

[11] D.Fodorean, A. Miraoui, Dimensionnement rapide des machines 

synchrones à aimants permanents (MSAP), Techniques de 

l’ingénieur, Nov. 10, 2009. 

[12] H-P. Nee, L. Lefevre, P. Thelin, J. Soulard, Determination of d 

and q reactances of permanent magnet synchronous motors 

without measurements of the rotor position, IEEE Trans. on 

Industry Applications, Vol. 36, No. 5, 1330-1335, Oct. 2000. 

[13] D. Samarkanov, F. Gillon, P.Brochet, D. Laloy , Optimal design 

of induction machine using interval algorithms, COMPEL: The 

International Journal for Computation and Mathematics in 

Electrical and Electronic Engineering, Vol. 31, N°.5, pages. 1492 - 

1502, ISBN. 0332-1649, 8-2012. 

[14] P. Alotto, U. Baumgartner, F. Freschi, M. Jaindl, A. Köstinger, 

Ch. Magele, W. Renhart, and M. Repetto, SMES Benchmark 

Extended: Introducing Pareto Optimal Solutions Into TEAM22, 

IEEE Transactions on Magnetics, Vol. 44, No.6, pp. 1066-1069, 

2008. 

[15] Mellor, P.H.; Roberts, D.; Turner, D.R.; , "Lumped parameter 

thermal model for electrical machines of TEFC design," Electric 

Power Applications, IEE Proceedings B , vol.138, no.5, pp.205- 

218, Sep 1991. 

[16] IEC 60034-30, Standard on efficiency classes for low voltage AC 

motors, 2008. 

[17] D. Stoia, M. Antonoaie, D. Ilea, M. Cernat, Design of Line Start 

PM Motors with High Power Factor, Proc. POWERENG 2007, 

Setubal, Portugal, 12-14 April, 2007, published on CD-Rom, 

IEEE Catalog Number 07EX1654C, ISBN: 1-4244-0895-4, paper 

186. 

[18] T. Miller, Synchronization of line-start permanent magnet AC 

motor, IEEE Trans. Power Apparatus and Systems, vol. PAS-103, 

July 1984, pp 1822-1828. 

[19] T. Tran, S. Brisset, P. Brochet, A Benchmark for Multi-objective, 

Multi-Level and Combinatorial Optimizations of a Safety 

Isolating Transformer, COMPUMAG 2007, Aachen, Germany, 

6- 2007 

[20] X. Feng, L. Liu, J. Kang, Y. Zhang, Super Premium Efficient 

Line Start-up Permanent Magnet Synchronous Motor, Proc. Of 

XIX International Conference on Electrical Machines, ICEM2010, 

Roma, Italy, Sept. 6-8, 2010. 

[21] A.I. Borisenko Cooling of industrial electrical machinery, 

Energoatomizdat, 1983.


Speed-up of Nonlinear Magnetic Field Analysis using a Modified 

Fixed-Point Method 

Norio Takahashi 1 , Kousuke Shimomura 1 , Daisuke Miyagi 2 and Hiroyuki Kaimori 3 

1 Dept. Electrical and Electronic Eng., Okayama University, Okayama 700-8530 Japan 

2 Dept. Electrical Eng., Tohoku University, Sendai 980-8579 Japan 

3 Science Solutions Int. Lab., Inc., Tokyo 153-0065 Japan 

The nonlinear finite element analysis of magnetic fields using the Fixed-Point method (FPM) requires a number of iterations and 

long CPU time compared with those using the Newton-Raphson method (NRM). On the other hand, the Fixed-Point method has an 

advantage that the convergence can be obtained even for a complicated nonlinear anisotropy problem, of which the convergence is 

very difficult using a conventional Newton-Raphson method. Moreover, it has an advantage that a software can be easily obtained by 

slightly modifying a linear FEM software. We then achieved the speed-up of the Fixed-Point method by updating the reluctivity at each 

iteration (This is called a modified Fixed-Point method). It is shown that the formulation of the Fixed-Point method using the 

derivative of reluctivity is almost the same as that of the Newton-Raphson method. The convergence properties of these methods are 

compared. It is shown that the modified Fixed-Point method has an advantage that the programming is easy and it has a similar 

convergence property to the Newton-Raphson method for an isotropic nonlinear problem. 

Index Terms—finite element method, Fixed-Point method, Newton-Raphson method, nonlinear electromagnetic analysis 


I. INTRODUCTION

The Fixed-Point method [1,2] has the advantage that convergence can be obtained even for complicated nonlinear problems [3], such as an analysis considering the vector magnetic properties of an anisotropic material [4, 5], in which convergence is sometimes difficult. In addition, the software for nonlinear analysis can be easily obtained by making a small change to that for linear analysis. However, the Fixed-Point method requires more iterations and longer CPU time than the Newton-Raphson method [6]. It is reported that the CPU time can be reduced by using a constant reluctivity at the beginning of the nonlinear iterations [7, 8]. Even so, nearly ten times longer CPU time is still necessary compared with the Newton-Raphson method.

In this paper, a modified Fixed-Point method, which updates the derivative of reluctivity at each iteration, is proposed. Furthermore, it is pointed out that the formulation of the Fixed-Point method using the derivative of reluctivity is the same as that of the Newton-Raphson method. The convergence characteristic of the newly proposed Fixed-Point method is compared with that of the Newton-Raphson method.

II. FORMULATION OF NRM AND FPM 

A. Newton-Raphson Method 

There are two kinds of methods which deal with the nonlinearity in the Newton-Raphson method (NRM). One is method A (NRM(B²)), which uses the ν-B² curve. In this method, the magnetic field strength H is given by

$$H = \nu(B^2)\,B \tag{1}$$

where B is the flux density. The reluctivity ν is given by

$$\nu(B^2) = \frac{H(B)}{B} \tag{2}$$

The other is method B (NRM(B)), which uses the B-H curve directly. In this method, the magnetic field strength H is given by

$$H = H(B)\,\frac{B}{|B|} \tag{3}$$

1) Method A (NRM(B²))

The static magnetic field equation can be written as follows in the case of the Newton-Raphson method using the ν-B² curve:

$$\nabla\times H = \nabla\times(\nu\,\nabla\times A) = J_0 \tag{4}$$

where A is the magnetic vector potential and J0 is the forced current density. The Galerkin equation G*i(A(k)) of (4) is given by

$$G_i^*(A^{(k)}) = \int \nabla\times N_i\cdot\left(\nu^{(k-1)}\,\nabla\times A^{(k-1)}\right)dV - \int N_i\cdot J_0\,dV \tag{5}$$

where, Ni is the interpolation function of the edge element. 

The residual Gi(A) at the k-th nonlinear iteration is given by 

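$$\begin{aligned}
G_i(A^{(k)}) &= G_i^*(A^{(k)}) + \sum_j \frac{\partial G_i^*(A^{(k)})}{\partial A_j}\,\delta A_j^{(k)}\\
&= \int \nabla\times N_i\cdot\left(\nu^{(k-1)}\,\nabla\times A^{(k-1)}\right)dV - \int N_i\cdot J_0\,dV
 + \sum_j \frac{\partial}{\partial A_j}\left[\int \nabla\times N_i\cdot\left(\nu^{(k-1)}\,\nabla\times A^{(k-1)}\right)dV\right]\delta A_j^{(k)}\\
&= G_i^*(A^{(k)}) + \sum_j\left[\int \nabla\times N_i\cdot\left(\nu^{(k-1)}\,\nabla\times N_j\right)dV
 + \int \nabla\times N_i\cdot\left(\frac{\partial \nu^{(k-1)}}{\partial A_j}\,\nabla\times A^{(k-1)}\right)dV\right]\delta A_j^{(k)}
\end{aligned} \tag{6}$$

$$\frac{\partial \nu}{\partial A_j} = \frac{\partial \nu}{\partial B^2}\,\frac{\partial B^2}{\partial A_j} = 2\,\frac{\partial \nu}{\partial B^2}\,B\cdot\frac{\partial B}{\partial A_j} \tag{7}$$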

where, A, ν etc. in (6) and (7) are values at the k-th iteration. 

∂ν/∂B 2 is the term which represents nonlinear magnetic 

properties. The process of calculation is as follows: 

1) The initial value of ν is determined.

2) δA(0) is set to zero.

3) A is updated by A(k) = A(k-1) + δA(k) using δA(k) calculated by (6).

4) ν(k) is calculated using the ν-B² curve from B obtained by A(k).

5) Steps 3) and 4) are repeated.

6) It is judged to be converged if δB(A(k)) is less than a specified small value.

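2) Method B (NRM(B))

The static magnetic field equation can be written as follows in the case of the Newton-Raphson method using the B-H curve:

$$\nabla\times H = J_0 \tag{8}$$

The Galerkin equation G*i(A) of (8) is given by

$$G_i^*(A^{(k)}) = \int \nabla\times N_i\cdot H^{(k-1)}\,dV - \int N_i\cdot J_0\,dV \tag{9}$$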

The residual Gi(A) at the k-th nonlinear iteration is given by 

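$$\begin{aligned}
G_i(A^{(k)}) &= G_i^*(A^{(k)}) + \delta G_i(A^{(k)})\\
&= \int \nabla\times N_i\cdot H^{(k-1)}\,dV - \int N_i\cdot J_0\,dV + \int \nabla\times N_i\cdot \delta H^{(k)}\,dV\\
&= G_i^*(A^{(k)}) + \int \nabla\times N_i\cdot\left(\frac{\partial H(B^{(k-1)})}{\partial B}\,\delta B^{(k)}\right)dV\\
&= G_i^*(A^{(k)}) + \int \nabla\times N_i\cdot\left(\frac{\partial H(B^{(k-1)})}{\partial B}\,\nabla\times\delta A^{(k)}\right)dV
\end{aligned} \tag{10}$$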

∂H(B)/∂B is the term which represents nonlinear magnetic 

properties. 

The process of calculation is as follows: 

1) The initial value of ∂H(B (0) )/∂B is determined. 

2) δA (0) is set to zero. 

3) A is updated by A (k) =A (k-1) +δA (k) using δA (k) calculated by 

(10). 

4) H (k) is calculated using the B-H curve from B obtained by 

A (k) . 

5) Steps 3) and 4) are repeated.

6) It is judged to be converged if δB(A (k) ) is less than a 

specified small value. 

B. Fixed-Point Method 

In the Newton-Raphson method, the reluctivity is updated 

in each nonlinear iteration as explained above. In the Fixed- 

Point method, the reluctivity is fixed at the first step and it is 

not changed during the nonlinear iterations. 

According to the concept of the Fixed-Point method [1], the 

magnetic field strength is given by 

$$H(B) = \nu_{FP}\,B + H_{FP} \tag{11}$$

where νFP is the Fixed-Point reluctivity, which is constant during the nonlinear iterations, and HFP is an additional magnetic field strength.

The static magnetic field equation can be written as follows in the case of the Fixed-Point method:

$$\nabla\times(\nu_{FP}\,B) = J_0 - \nabla\times H_{FP} \tag{12}$$

where HFP(k) at the k-th nonlinear iteration can be obtained by the following equation:

$$H_{FP}^{(k)} = H(B^{(k-1)}) - \nu_{FP}\,B^{(k-1)} \tag{13}$$

where H(B(k-1)) is the magnetic field strength vector on the B-H curve corresponding to the flux density B(k-1) at the (k-1)-th nonlinear iteration. HFP(k) converges to some value after iterations. The residual Gi(A) of (12) is given by

$$G_i(A^{(k)}) = \int \nabla\times N_i\cdot\left(\nu_{FP}\,\nabla\times A^{(k)}\right)dV - \int N_i\cdot J_0\,dV + \int \nabla\times N_i\cdot H_{FP}^{(k)}\,dV \tag{14}$$

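By substituting HFP(k) in (13) into HFP(k) in (14), we obtain

$$\begin{aligned}
G_i(A^{(k)}) &= G_i^*(A^{(k)}) + \int \nabla\times N_i\cdot H_{FP}^{(k)}\,dV\\
&= G_i^*(A^{(k)}) + \int \nabla\times N_i\cdot\left(H(B^{(k-1)}) - \nu_{FP}\,B^{(k-1)}\right)dV\\
&= \int \nabla\times N_i\cdot\left(\nu_{FP}\,\nabla\times A^{(k)}\right)dV - \int N_i\cdot J_0\,dV
 + \int \nabla\times N_i\cdot H(B^{(k-1)})\,dV - \int \nabla\times N_i\cdot\left(\nu_{FP}\,\nabla\times A^{(k-1)}\right)dV\\
&= \int \nabla\times N_i\cdot\left(\nu_{FP}\,\nabla\times \delta A^{(k)}\right)dV - \int N_i\cdot J_0\,dV + \int \nabla\times N_i\cdot H(B^{(k-1)})\,dV
\end{aligned} \tag{15}$$

where δA(k) = A(k) - A(k-1).

Gi*(A(k)) is given by

$$G_i^*(A^{(k)}) = \int \nabla\times N_i\cdot\left(\nu_{FP}\,\nabla\times A^{(k)}\right)dV - \int N_i\cdot J_0\,dV \tag{16}$$

The last two terms in the final line of (15) have the form of the Newton-Raphson expression (9), so the Fixed-Point residual differs from it only by the term containing νFP ∇×δA(k).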

In the actual calculation, (14) is used in the Fixed-Point 

method. 


The process of calculation is as follows:

1) The initial value of νFP is determined.

2) HFP (0) is set to zero. 

3) B (k) is obtained from A (k) which is calculated by (14). 

4) HFP (k) is obtained by (13). 

5) The right hand side of (14) is updated and the process from 

3) to 5) is repeated. 

6) It is judged to be converged if the change of B (k) is less 

than the specified small value. 

According to (15), we found that δHFP(k) is given by

$$\delta H_{FP}^{(k)} = \nu_{FP}\,\nabla\times\delta A^{(k)} = H_{FP}^{(k)} - H_{FP}^{(k-1)} \tag{17}$$

(17) means that the difference δHFP(k) plays the same role as δH in (10) of the Newton-Raphson method and it can be used to judge convergence.

Fig. 1 shows the concept of the nonlinear magnetic field analysis using the Fixed-Point method. A white circle on the B axis is a convergence target. In this method, the reluctivity νFP shown in Fig. 1 is given as an initial value, and νFP is not changed during the iterations. The flux density B(1) is obtained by the linear magnetic field analysis. Next, HFP(1), which corresponds to the flux density B(1) on the B-H curve and νFP B(1) on the line of νFP shown in Fig. 1(a), is obtained. As the iterations proceed, HFP(k) settles to a constant value, which means the difference δHFP(k) becomes almost zero. Then, the converged result can be obtained.

C. Modified Fixed-Point Method 

In the modified Fixed-Point method, the derivative of reluctivity is updated at each iteration. In this expression, the HFP(k) at the k-th nonlinear iteration in (13) can be rewritten by the following equation:

$$H_{FP}^{(k)} = H(B^{(k-1)}) - \frac{\partial H(B^{(k-1)})}{\partial B}\,B^{(k-1)} \tag{18}$$

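The residual Gi(A(k)) is given by

$$G_i(A^{(k)}) = G_i^*(A^{(k)}) + \int \nabla\times N_i\cdot\left(H(B^{(k-1)}) - \frac{\partial H(B^{(k-1)})}{\partial B}\,B^{(k-1)}\right)dV \tag{19}$$

Here Gi* has the form of (16) with νFP replaced by ∂H(B(k-1))/∂B. (19) can be written as follows:

$$\begin{aligned}
G_i(A^{(k)}) &= \int \nabla\times N_i\cdot\left(\frac{\partial H(B^{(k-1)})}{\partial B}\,\nabla\times A^{(k)}\right)dV - \int N_i\cdot J_0\,dV\\
&\quad + \int \nabla\times N_i\cdot H(B^{(k-1)})\,dV - \int \nabla\times N_i\cdot\left(\frac{\partial H(B^{(k-1)})}{\partial B}\,\nabla\times A^{(k-1)}\right)dV\\
&= \int \nabla\times N_i\cdot H(B^{(k-1)})\,dV - \int N_i\cdot J_0\,dV + \int \nabla\times N_i\cdot\left(\frac{\partial H(B^{(k-1)})}{\partial B}\,\nabla\times \delta A^{(k)}\right)dV\\
&= G_i^*(A^{(k)}) + \int \nabla\times N_i\cdot\left(\frac{\partial H(B^{(k-1)})}{\partial B}\,\nabla\times \delta A^{(k)}\right)dV
\end{aligned} \tag{20}$$

where Gi* in the last line of (20) denotes the Newton-Raphson expression (9).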

(10) and (20) show that the formulation of the modified Fixed-Point method is the same as that of the Newton-Raphson method.

In the actual calculation of the modified Fixed-Point method, (19) is used. The process of calculation is as follows:

1) The initial value of ∂H(B(0))/∂B is determined.

2) HFP(0) is set to zero.

3) B(k) is obtained from A(k), which is calculated by (19).

4) HFP(k) is obtained by (18).

5) The right hand side of (19) is updated and the process from 3) to 5) is repeated.

6) It is judged to be converged if the change of B(k) is less than the specified small value.

Fig. 1 Conceptual diagram of Fixed-Point method. (a) 1st step. (b) 2nd step. (c) 3rd step.

Fig. 2 shows the concept of the nonlinear magnetic field analysis using the modified Fixed-Point method. In this method, the reluctivity νFP shown in Fig. 2(a) is given as an initial value, and the derivative ∂H/∂B is updated at each iteration. At the initial iteration, the linear magnetic field analysis is carried out using the given ∂H/∂B, and the flux density B(1) is obtained. Next, H(B(1)) corresponding to the flux density B(1) on the B-H curve and ∂H(B(1))/∂B·B(1) = νFP B(1) on the line ∂H(B(1))/∂B shown in Fig. 2(a) are obtained. At the first step, δHFP(1) = HFP(1), following the definition of δHFP(k) in (17). The iteration is carried out until δHFP becomes nearly zero. δH (which corresponds to δHFP in (17)) can be directly obtained by using the Newton-Raphson method as shown in (10). The modified Fixed-Point method needs two steps (Eqs. (18) and (19)), but the concept is the same as that of the Newton-Raphson method.

Fig. 2 Conceptual diagram of Modified Fixed-Point method. (a) 1st step. (b) 2nd step. (c) 3rd step.
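The relation between the two schemes can be tried out on a scalar stand-in for the field problem. The sketch below is only an illustration under stated assumptions: a closed iron path without air gap, so that the field equation reduces to H(B) = H_target, and a hypothetical analytic B-H curve `h_curve`. With a frozen slope the loop follows (13); with the slope updated at every iteration it follows (18), and each step is then algebraically a Newton-Raphson step.

```python
def h_curve(b):
    """Hypothetical smooth B-H curve: H in A/m for B in T (not measured data)."""
    return 100.0 * b + 50.0 * b**9


def dh_db(b):
    """Derivative dH/dB of the hypothetical curve."""
    return 100.0 + 450.0 * b**8


def fixed_point_solve(h_target, b0, update_slope, tol=1e-10, max_iter=500):
    """Solve H(B) = h_target by the Fixed-Point scheme.

    update_slope=False keeps nu fixed (Fixed-Point method);
    update_slope=True refreshes nu = dH/dB each iteration (modified method).
    Returns the converged B and the number of iterations used.
    """
    b = b0
    nu = dh_db(b0)  # initial slope, chosen here as dH/dB at the start point
    for k in range(1, max_iter + 1):
        if update_slope:
            nu = dh_db(b)
        h_fp = h_curve(b) - nu * b        # additional field, as in (13) or (18)
        b_new = (h_target - h_fp) / nu    # linear solve: nu * B = H_target - H_FP
        if abs(b_new - b) < tol:
            return b_new, k
        b = b_new
    raise RuntimeError("no convergence within max_iter")


b_fpm, n_fpm = fixed_point_solve(2000.0, 1.6, update_slope=False)
b_mfpm, n_mfpm = fixed_point_solve(2000.0, 1.6, update_slope=True)
print(f"FPM : B = {b_fpm:.6f} T after {n_fpm} iterations")
print(f"MFPM: B = {b_mfpm:.6f} T after {n_mfpm} iterations")
```

Both variants reach the same flux density, and the variant with the updated slope needs fewer iterations in this toy case, which mirrors the trend of Table I. The iteration counts of a scalar example say nothing about CPU time in a finite element code.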

III. ANALYZED MODEL 

The modified Fixed-Point method is applied to the analysis of the magnetic field in the billet heater model [9] shown in Fig. 3. The analysis domain is 1/8 of the model. The material of the yoke is 35A230 (non-oriented electrical steel). The material of the billet is S45C (carbon steel). The numbers of elements and nodes are 107632 and 115101, respectively. The ampere-turns of the coil are set to 70000 AT (60 Hz). The CPU time and number of iterations of the Fixed-Point method (FPM), modified Fixed-Point method (MFPM), Newton-Raphson method using the ν-B² curve (NRM(B²)), and Newton-Raphson method using the B-H curve (NRM(B)) are compared. For simplicity, only the first step of the step-by-step method for the nonlinear eddy current analysis is calculated. As the total CPU time is almost proportional to the number of steps, comparing only the first step is sufficient to compare the methods.

IV. RESULTS AND DISCUSSION 

Fig. 4 shows an example of the flux density distribution obtained by NRM(B) and MFPM. The results of NRM(B²) and FPM are the same as those in Fig. 4. The CPU time and the number of iterations are compared in Table I. The convergence property is shown in Fig. 5. The convergence criterion is δB(A) < 2.0×10⁻³. The convergence criterion ‖G(n)‖/‖G(0)‖ of the ICCG method is chosen as less than 10⁻⁵. An Intel Core2 Duo E8400 @ 3.16 GHz with 3 GB RAM is used. These results suggest that the convergence property of MFPM is close to that of NRM. In particular, MFPM needs fewer iterations than NRM(B) (22 versus 28 in Table I), although its CPU time is longer (781.69 s versus 654.83 s). It is also clarified that NRM(B²) is faster than NRM(B).

Fig. 3 Analyzed model of billet heater. (a) bird's eye view (1/8 region). (b) xy plane.

TABLE I
COMPARISON OF CPU TIME AND ITERATIONS

Method      CPU Time (sec)   Iterations
NRM(B²)     370.08           13
NRM(B)      654.83           28
FPM         3432.24          101
MFPM        781.69           22

PC performance: Intel Core2 Duo E8400 @ 3.16 GHz, 3 GB RAM

Fig. 4 Comparison of numerical results of flux distribution using NRM(B) and MFPM. (a) NRM(B). (b) MFPM.

Fig. 5 Convergence property (number of nonconverged elements versus iterations for NRM(B²), NRM(B), FPM and MFPM).

V. CONCLUSIONS

The obtained results can be summarized as follows:

(a) The formulation of the modified Fixed-Point method (MFPM) using the derivative of reluctivity is almost the same as that of the Newton-Raphson method (NRM).

(b) The modified Fixed-Point method (MFPM) has an advantage that the CPU time is less than that of the Newton-Raphson method (NRM) in some conditions, or MFPM has almost the same performance as NRM. Moreover, the programming is easy compared with NRM.

REFERENCES

[1] F.I.Hantila, G.Preda, and M.Vasiliu: “Polarization method for static fields”, IEEE Trans. Magn., vol.36, no.4, pp.672-675, 2000.

[2] M.Chiampi, D.Chiarabaglio, and M.Repetto: “A Jiles-Atherton and 

fixed-point combined technique for time periodic magnetic field 

problems with hysteresis” , IEEE Trans. Magn., vol.31, no.6, pp.4306- 

4311, 1995. 

[3] D.Miyagi, K.Shimomura, N. Takahashi, H. Kaimori: “Usefulness of 

fixed point method in electromagnetic field analysis in consideration of 

nonlinear magnetic anisotropy”, Digest of IEEE CEFC, 2012. 

[4] S.Urata, M.Enokizono, T.Todaka, and H.Shimoji: “Magnetic characteristic analysis of the motor considering 2-D vector magnetic property”, IEEE Trans. Magn., vol.42, no.4, pp.615-618, 2006.

[5] K.Fujiwara, T.Adachi, and N.Takahashi: “A proposal of finite-element analysis considering two-dimension magnetic properties”, IEEE Trans. Magn., vol.38, no.2, pp.889-892, 2002.

[6] T.Nakata, N.Takahashi, K.Fujiwara, and N.Okamoto: “Improvements of convergence characteristics of Newton-Raphson method for nonlinear magnetic field analysis”, IEEE Trans. Magn., vol.28, no.2, pp.1048-1051, 1992.

[7] E.Dlala, A.Belahcen, and A.Arkkio : “Locally convergent fixed-point 

method for solving time-stepping nonlinear field problems”, IEEE Trans. 

Magn., vol.43, no.11, pp.3969-3975, 2007. 

[8] E.Dlala, A.Belahcen, and A.Arkkio : “A fast fixed-point method for 

solving magnetic field problems in media of hysteresis”, IEEE Trans. 

Magn., vol.44, no.6, pp. 1214 -1217, 2008. 

[9] N.Takahashi, S.Nakazaki, D.Miyagi, N.Uchida, K.Kawanaka, and 

H.Namba: “3-D optimal design of laminated yoke of billet heater for 

rolling wire rod using ON/OFF method”, Archives of Electrical 

Engineering, vol.61, no.1, pp.115-123, 2012. 




Software Agent Based Domain Decomposition Method 

1) M. Jüttner, 1) A. Buchau, 2) M. Rauscher, 1) W. M. Rucker, and 2) P. Göhner 

1) Institute for Theory of Electrical Engineering, Pfaffenwaldring 47, D-70569 Stuttgart, Germany 

2) Institute of Industrial Automation and Software Engineering, Pfaffenwaldring 47, D-70569 Stuttgart, Germany 

E-mail: ite@ite.uni-stuttgart.de 

Abstract—A workbench is described, able to divide complex coupled three dimensional multiphysics simulations into 

smaller parts based on existing domain decomposition techniques. These parts are calculated by software agents, which makes it possible to distribute the calculation widely over multiple computers and even into the cloud in order to speed up the computation, to make larger simulations possible and to actively manipulate and control the strategy and the process of solving.

Index Terms— cloud computing, coupled multiphysics problems, domain decomposition, software agents 

I. INTRODUCTION

Finding a solution for complex three dimensional coupled field problems more efficiently is the goal of this approach. Therefore, established methods for numerical solutions like finite element methods (FEM) and boundary element methods (BEM) are combined with the idea of software agents.

Software agents are a way to develop flexible and efficient software based on the concept of agent oriented software development [1]. The system is divided into autonomous and self-organized agents. These agents are independent of each other and capable of making decisions within their possibilities. They are able to interact with each other via messages and data exchange, and on this basis they negotiate to reach their individual goals. The communication between the agents also allows a dynamic handling of multiple situations per agent as well as for the global system, which yields dynamic and well-fitting agent behaviour. Within the context of agent based systems, a systematic distribution of the functions necessary for solving a problem to different agents limits the coupling between the agents and results in an even more flexible and manageable system. This flexibility allows the approach described in section II to handle systems with multiple boundary conditions and to solve weakly coupled systems. The approach of software agents is currently well established in automation technology and is used, for example, for self-management in automation systems [2], modelling smart grids [3], prioritization of test cases [4] or optimising electromagnetic field problems [5].

Because available computer resources strongly influence the solution of numerical problems, workstations with multiple multicore CPUs and a relatively large RAM can be used nowadays. Modern programming languages make good use of all of these resources. As calculation power increases, the problems get larger and coupled effects are considered as well. Nowadays large simulation problems are mostly solved on huge computer clusters with identical computers. Thanks to modern operating systems, the use of temporarily available resources offers a high-performance alternative [6]. Products like the Microsoft High Performance Computing Server or the Enterprise Linux Cluster from Redhat or Suse offer a simple way to spread tasks to different resources like idle workstations, laptops or smartphones and even online resources located within the cloud. These resources can be used by established and multifunctional solving methods like FEM or BEM. FEM is able to handle non-linear and anisotropic material effects and leads to large systems of non-linear equations. An alternative to the FEM is the BEM. With BEM only the surface of a model needs to be discretised. This leads to a much smaller system of equations represented by a dense matrix. The calculation time for the matrix becomes acceptable if matrix compression is used. For this purpose the fast multipole method (FMM) and the adaptive cross approximation (ACA) can be used [7], which also allows hysteresis effects for magnetic fields to be calculated [8].

II. COUPLING SOFTWARE AGENTS AND SOLVER 

For a large coupled simulation, creating a single system of equations that includes all effects is mostly not reasonable. Splitting the problem into smaller parts that can be solved iteratively can reduce the total expense of the large nonlinear problem [9], especially when small changes in the partial problems can be ignored and do not lead to further iterations of the calculation. For this purpose a so-called coordination agent is created. This agent splits the coupled problem into partial problems. Examples of different classes of partial problems are single-physics problems as well as geometry or material based partial problems. All functions of the coordination agent are described in section II.B. The partial problems are then assigned to different calculation agents. Fig. 1 describes the cooperation of software agents.

Fig. 1: Concept of cooperation agents

Each calculation agent solves its partial problem with an approach
optimized for it. Consequently, a combination of multiple methods such as
FEM and BEM for different partial problems is possible. This allows the
advantages of FEM and BEM to be combined across multiple different
calculation resources. The calculation agents do not need to run within
one system. Different resources can be used if the calculation agents are
distributed to multiple computers or even the cloud, as displayed in
Fig. 2. The calculation agents are described in detail in section II.C.
Together, these software agents are able to solve coupled problems. The
necessity of this new approach for handling multiple physics and large
systems can be seen in [10], where the simulation was only possible with
high effort to reach convergence.

 

Fig. 2: Distributed Agents 

A. Steps to a successful simulation 

This subsection puts the approach into context and describes the process
of creating and solving a complex coupled problem with it. The process is
visualized in Fig. 3.

[Fig. 3 flowchart, box labels: Modelling the system with a FEM-software,
including mesh and boundary conditions; Export mesh and boundary
conditions; Mesh-management within the agent-system (divide mesh into
smaller parts, mapping the boundary conditions); Calculation Agent 1,
Calculation Agent 2, ..., Calculation Agent n (solve partial problem,
parallel and independent; exchange of boundary conditions, status and
results); convergence: yes, combine solutions; no, build a finer mesh.]

Fig. 3: The process for a simulation

- Initially, a model approximating the actual problem needs to be
created. The approach places no special requirements on the model itself,
so the model can be created with commonly used CAD software tools. The
same holds for the mesh of the model, so common meshing tools can be
used. For complex coupled problems it is important to consider all
effects within one single mesh, because the calculation is influenced by
the geometry of all physics as well as by their coupling.

- To allow any solver to create a suitable solution, the boundary
conditions for the given mesh, including all coupled physics, have to be
set. The boundary conditions as well as the previously created mesh have
to be exported in a form that can be handed over to the coordination
agent.


- The mesh and the boundary conditions are now handed to a coordination
agent. The coordination agent is responsible for finding a solution for
the given problem. To do so, it splits the problem into smaller parts.
The number of smaller parts depends on the number of available
calculation agents or is defined manually. In case of a resource-based
splitting, the assignment of the partial problems to the different
calculation agents is straightforward. In case of a manually defined
number of partial problems, each partial problem is solved iteratively
according to the availability of resources. The partial problems are now
distributed to the available calculation agents.

- The calculation agents now start solving their allocated partial
problems. Because each calculation agent runs on its own hardware, the
agent is able to use all resources of this hardware to solve its partial
problem fast and highly parallel. If a new calculation agent, including
new hardware resources, appears in the system, the coordination agent has
to determine whether the resource should be used and which agent splits
its partial problem. If a calculation agent drops out, the coordination
agent needs to assign its partial problem to another calculation agent.
This behaviour allows a dynamic adaptation of the system. To do so,
information on the status of the different calculation agents has to be
distributed.

- To allow the system of agents to solve a coupled problem, the
calculation agents have to exchange results with each other as soon as
they are available. If a calculation agent is able to understand and
interpret the results of another calculation agent, new boundary
conditions can be derived and integrated into its own calculation. This
process continues until all calculation agents have finished. If some
partial problems do not reach convergence, the calculation can be
interrupted and reinitialised with a new set of partial problems without
recalculating successfully solved parts.

- Finally the coordination agent combines all results 

depending on the way they were split before and returns 

the result to the user. 

An example describing the advantages of this approach 

is shown in section II.E. 
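
The exchange-and-repeat cycle described in the steps above can be sketched in a few lines. This is only a minimal illustration under stated assumptions, not the implementation of the paper: the function `solve_coupled`, the two toy agent functions and all numerical values are hypothetical, and the partial solves run one after the other here, whereas the real calculation agents work in parallel on separate hardware.

```python
from typing import Callable, Dict


def solve_coupled(agents: Dict[str, Callable[[Dict[str, float]], float]],
                  initial: Dict[str, float],
                  tol: float = 1e-9,
                  max_iter: int = 100) -> Dict[str, float]:
    """Repeat the partial solves until the exchanged values stop changing."""
    state = dict(initial)
    for _ in range(max_iter):
        # Each agent solves its partial problem from the last exchanged state.
        new_state = {name: solve(state) for name, solve in agents.items()}
        change = max(abs(new_state[k] - state[k]) for k in state)
        state = new_state
        if change < tol:
            return state  # convergence: the partial solutions can be combined
    raise RuntimeError("no convergence; a different splitting would be needed")


# Toy coupled problem (hypothetical numbers): the resistive loss depends
# weakly on the temperature, and the temperature depends on the loss.
def electric_agent(state: Dict[str, float]) -> float:
    return 10.0 * (1.0 + 0.004 * (state["temperature"] - 20.0))  # loss in W


def thermal_agent(state: Dict[str, float]) -> float:
    return 20.0 + 2.0 * state["loss"]  # temperature in degrees Celsius


result = solve_coupled({"loss": electric_agent, "temperature": thermal_agent},
                       {"loss": 0.0, "temperature": 20.0})
```

Because the coupling in this toy problem is weak, the exchanged values settle after a few iterations; a strongly coupled problem may need a different splitting, which is the case the coordination agent has to handle.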

B. The Coordination Agent 

The coordination agent is an independent program with a small set of
functions. It is visible both inside and outside the agent system and
represents the interface between the users and the calculation agents.
The graphical user interface (GUI) provides the interface to the user and
controls all functions described below. The internal interface is
realised via a message system handling different types of messages
received from other agents. In addition, the coordination agent offers
process variables such as the convergence criteria and the overall
progress, so each problem needs to be assigned to at least one
coordination agent. The different functions of the coordination agent are
summarized in Fig. 4 and described in detail in the following.

- Via its GUI the coordination agent offers the 

interface to load a problem. A second problem can only 

be loaded, if the first is solved and the results are either 

collected by the user or the actual calculations are 

interrupted and possible results are dropped. The GUI is 

additionally used to display calculated results.

- Solving a model is only possible if calculation agents are available.
These agents need to be able to solve all problem classes occurring in
the actual problem. Therefore the coordination agent collects and manages
information about all available calculation agents and their
capabilities. In this context economic aspects can also be considered in
the solving process by weighting different agents. If multiple
coordination agents work in the same environment, the status of the
agents has to be tracked to avoid assigning multiple tasks to the same
agent.

- To instruct a calculation agent to solve a partial 

problem, the partial problem must be created. In case of a 

simple problem and a single calculation agent able to 

solve the problem, the partial problem can be the problem 

and can directly be handed over to the calculation agent. 

In all other cases the problem must be split. 

An obvious splitting for weakly coupled systems is based on the different
types of physics. Further opportunities for splitting result from BEM-FEM
coupling, which combines the advantages of both methods by calculating
the non-linear equations with FEM while the linear ones are calculated
with BEM. An approach splitting the different physics and considering
BEM-FEM coupling is realised for electromagnetic field problems in [11].
There it was shown that iterative coupling of BEM and FEM improves
convergence compared to a strong coupling of the different physics.
Therefore this segmentation based on the different types of equations and
physics is used.

Another way to decompose different domains is based 

on regions solvable with the same numerical method. The 

regions are usually segmented by borders of the different 

materials. This is especially useful for distributed 

calculation and for different discretisation size within one 

model. The idea as well as a domain decomposition based 

on the number of available resources is realised for FEM 

in FETI [12] and for BEM in BETI [13]. 

Another idea is based on overlapping regions only 

considering Dirichlet boundary conditions [14]. This is of 

special interest, if we take a look at the amount of data 

exchanged between different agents. 

All mentioned methods for domain decomposition have in common that the
decomposition has to be done before the actual solving starts. The
agent-based approach additionally makes a flexible or dynamic adaptation
to results or partial solutions possible. This rapidly increases the
speed of convergence for a complex simulation. The approach is thus more
flexible, more dynamic and better adapted to available resources than
existing domain decomposition algorithms.

- In a next step the partial problems need to be 

distributed to the different calculation agents. Therefore 

the coordination agent reserves required calculation 

agents. Further it shares all necessary information 

including the actual partial problem and initialises the 

solving process. 

If a calculation agent finishes, the coordination agent receives a
notification. The coordination agent then updates its progress variables
and checks whether all other agents working on the same problem have
finished. In this case the overall solution is available. If other agents
are still working, only a partial solution can be offered. If no solution
can be found, a finer mesh is created and the splitting process is
initialised again in the hope of finding solvable partial problems. The
user is finally informed about these circumstances.

Fig. 4: The Coordination agent 
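
The bookkeeping of the coordination agent described in this subsection (collecting information about available calculation agents, weighting and reserving them, and tracking finished partial problems) might look roughly as follows. The sketch is an assumption for illustration only: the class names `AgentInfo` and `Coordinator`, their fields and the weighting rule (lowest weight wins) are hypothetical and do not come from the paper or from JADE.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class AgentInfo:
    name: str
    problem_classes: Set[str]
    weight: float = 1.0   # hypothetical cost of the resource, lower is preferred
    busy: bool = False


@dataclass
class Coordinator:
    agents: Dict[str, AgentInfo] = field(default_factory=dict)
    pending: Dict[str, str] = field(default_factory=dict)  # problem id -> agent name
    finished: Set[str] = field(default_factory=set)

    def register(self, info: AgentInfo) -> None:
        self.agents[info.name] = info

    def assign(self, partial_problems: Dict[str, str]) -> None:
        """partial_problems maps a partial-problem id to its problem class."""
        for pid, pclass in partial_problems.items():
            free = [a for a in self.agents.values()
                    if not a.busy and pclass in a.problem_classes]
            if not free:
                raise LookupError(f"no free agent for class {pclass!r}")
            chosen = min(free, key=lambda a: a.weight)
            chosen.busy = True  # reserve it so that no second task is sent to it
            self.pending[pid] = chosen.name

    def notify_finished(self, pid: str) -> bool:
        """Record a finished partial problem; True once all are finished."""
        agent = self.agents[self.pending.pop(pid)]
        agent.busy = False
        self.finished.add(pid)
        return not self.pending


coord = Coordinator()
coord.register(AgentInfo("agent-fem", {"electric", "thermal"}, weight=2.0))
coord.register(AgentInfo("agent-bem", {"electric"}, weight=1.0))
coord.assign({"p-electric": "electric", "p-thermal": "thermal"})
assignment = dict(coord.pending)
```

In this example the cheaper agent receives the electric partial problem, which leaves the more general agent free for the thermal one.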

C. The Calculation Agent 

Calculation agents are independent programs. At start-up all necessary
parameters are set independently of any usage by a coordination agent.
The functions of the calculation agent are summarized in Fig. 5 and
described in detail in the following.

- Before a calculation agent can offer its service to a coordination
agent, it has to complete its description. A unique name must be set for
each calculation agent while it is initialised. The current status of the
agent and the problem classes it is capable of solving need to be
specified as well. The problem classes the calculation agent is able to
solve depend on the specific solver it is connected to. To reach a
flexible system and to allow calculation agents to connect to different
solvers, a solver interface is created that is capable of handling all
data and information connected to the solver. The solver interface is
described in detail in section II.D. An important piece of information is
the commissioner of the task the calculation agent is currently working
on. Therefore the name of the coordination agent is stored within each
agent so that notifications can be sent to the commissioner when
necessary.

- The major task of a running calculation agent is to manage the solver
interface and to satisfy its needs. The agent provides all information
requested by a solver and passes it on to the solver interface. Examples
are the initial boundary conditions received from the coordination agent
and the tolerances for the solver.

Whenever another calculation agent reports an available result, the
calculation agent has to check whether the result influences its own
calculation. If it does, a re-initialisation of the solver process is
necessary. This includes stopping the actual solver process, updating the
boundary conditions after the calculation agent has received the result
from the other calculation agent, and starting the newly initialised
solver. After a successfully calculated partial solution, the calculation
agent notifies all calculation agents connected to the problem about this
result and distributes the result as new boundary conditions if they are
requested. The coordination agent also needs to be informed about the
successful calculation and the availability of the results. If the
calculation fails or the solving algorithm chosen by the calculation
agent does not converge, the coordination agent has to be informed as
well. The same holds for a loss of available solver resources.

- Whenever a calculation agent is started, its GUI is displayed. This GUI
allows setting the calculation agent parameters as well as connecting the
agent to a solver interface, including the parameters necessary for this
connection. Examples of these parameters are the host the solver is
running on, the port on which this host accepts a connection and the
problem classes that can be solved with the given solver. All functions
the GUI provides are needed whenever the coordination agent instructs a
calculation agent to solve a problem. Therefore the GUI provides
functions such as loading a model, starting the calculation, extracting
the results and cancelling the actual solving process. These functions
can also be used without a connection to a coordination agent. In this
case the calculation agent solves simple problems on its own.

- To track the actual process of solving a partial 

problem and to understand the behaviour of a calculation 

agent each calculation agent offers a separate function of 

writing a log file and displaying it within the GUI. 

To remain available to the coordination agent during the complete solving
process and to allow the calculation agent to react flexibly to
information from other agents, at least two threads are created within
the calculation agent. The first thread represents the functionality of
the calculation agent. Further threads are used by the solver and its
interface to solve the partial problem. Only the realisation with
multiple threads allows the calculation agent to handle requests from the
coordination agent, to check for changes in the boundary settings and to
interrupt the solver if necessary while it is calculating a solution for
the problem. A quick and efficient message exchange, which allows
sending, receiving and processing messages with very little delay, is
another important part of granting the flexibility of the system.
Consequences of received messages for the solver are handled by the
solver interface.

Fig. 5: The Calculation Agent 
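
The two-thread structure described above, with one thread handling messages and a second one running the solver until it is interrupted, can be sketched as follows. This is a minimal sketch under stated assumptions: the class `CalculationAgent`, the callback `solver_step`, the scalar boundary value and the restart threshold are all hypothetical simplifications, and Python threads stand in for the Java threads of the actual implementation.

```python
import threading
import time


class CalculationAgent:
    """The calling thread handles messages; a worker thread runs the solver."""

    def __init__(self, solver_step, steps, boundary, threshold):
        self.solver_step = solver_step  # hypothetical: performs one solver iteration
        self.steps = steps
        self.boundary = boundary
        self.threshold = threshold      # smallest change that justifies a restart
        self.result = None
        self.runs = 0
        self._stop = threading.Event()
        self._worker = None

    def start(self):
        self._stop.clear()
        self.runs += 1
        self._worker = threading.Thread(target=self._solve, args=(self.boundary,))
        self._worker.start()

    def _solve(self, value):
        for _ in range(self.steps):
            if self._stop.is_set():     # interrupted by the message-handling thread
                return
            value = self.solver_step(value)
        self.result = value

    def on_result(self, new_boundary):
        """Handle a result reported by another agent; True if the solver restarts."""
        if abs(new_boundary - self.boundary) < self.threshold:
            return False                # no significant influence, keep solving
        self._stop.set()                # stop the running solver,
        self._worker.join()
        self.result = None
        self.boundary = new_boundary    # update the boundary condition,
        self.start()                    # and start the re-initialised solver
        return True

    def wait(self):
        self._worker.join()
        return self.result


started = threading.Event()


def slow_step(value):
    started.set()
    time.sleep(0.005)                   # stands in for real numerical work
    return value + 1.0


agent = CalculationAgent(slow_step, steps=100, boundary=0.0, threshold=0.5)
agent.start()
started.wait()
ignored = agent.on_result(0.1)          # small change: the solver keeps running
restarted = agent.on_result(5.0)        # significant change: the solver restarts
final = agent.wait()
```

The agent stays responsive while the solver is busy because `on_result` runs in the calling thread and only signals the worker through the event.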

D. The Solver Interface 

The solver interface is part of the calculation agent. It is the bridge
between the calculation agent and a solver. It allows the calculation
agent to solve at least one problem class. The functions of the solver
interface are described in the following and summarized in Fig. 6.

- To establish a connection between the calculation agent and different
solvers, the solver interface can be understood as a collection of
libraries controlling a variety of solvers. When the calculation agent is
started, the specific type of solver must be selected and the parameters
for reaching the solver must be set. Examples are the host name and the
network port at which the solver can be reached, or the locally running
solver application. If a single calculation agent manages multiple solver
interfaces, calculation agents able to solve multiclass problems can be
created.

- If a connection to a solver is established the 

necessary parameters need to be set. Therefore the partial 

problem received by the calculation agent must be 

translated to a form the solver does understand and 

passed to the solver. 

- In the next step the solver interface starts the solver. 

- While the solver is calculating, the major task of the solver interface
is to control the solver and manage its output. This includes monitoring
all information created by the solver as well as interrupting the solver
to check for possible changes due to results of other agents. Additional
information such as the convergence behaviour of the actual partial
problem, the availability of the solver, its resources and an estimate of
the remaining calculation time is collected by the solver interface and
passed to the calculation agent. Analysing this information allows the
calculation agent and also the coordination agent to react quickly to any
changes of the system. An example is a temporarily unavailable solver. In
this case the calculation agent is temporarily unable to solve problems
and has to disconnect itself from the coordination agent and the problem.
The coordination agent then has to find an alternative to the calculation
agent to solve the problem successfully. Another example concerns the
likelihood of convergence. If the calculation agent recognizes that
convergence is unlikely, it has to reconsider the chosen form of the
solver or, in the worst case, a message has to be sent to the
coordination agent to replace the actual splitting by a different one.

- After a successful calculation the solver interface notifies the
calculation agent so that it can inform other agents about the available
result. All information connected to the message exchange between
different agents is handled by the calculation agent. The solver
interface only takes care of the information directly connected to the
solver. If the result is requested, the solver interface extracts the
result and translates it into a form that can be shared with other
agents. If the result or the solver is no longer needed, the solver
interface detaches the connection to the solver, releases reserved
resources and initialises the deregistering process of the calculation
agent.

Fig. 6: The solver interface 
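
The functions of the solver interface listed above (connect, pass the translated problem, start, monitor, extract the result, release) suggest an abstract interface with one concrete class per solver. The sketch below is only an illustration of that idea: the method names, the class `MidpointSolver` and the host and port values are hypothetical, and the stand-in solver does no field calculation but merely returns the mean of two boundary temperatures.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict


class SolverInterface(ABC):
    """Bridge between a calculation agent and one concrete solver."""

    problem_classes: frozenset = frozenset()

    @abstractmethod
    def connect(self, host: str, port: int) -> None: ...

    @abstractmethod
    def set_problem(self, partial_problem: Dict[str, Any]) -> None: ...

    @abstractmethod
    def run(self) -> None: ...

    @abstractmethod
    def status(self) -> Dict[str, Any]: ...

    @abstractmethod
    def result(self) -> Dict[str, Any]: ...

    @abstractmethod
    def release(self) -> None: ...


class MidpointSolver(SolverInterface):
    """Stand-in solver for illustration; it averages two boundary values."""

    problem_classes = frozenset({"thermal"})

    def __init__(self) -> None:
        self._connected = False
        self._problem: Dict[str, float] = {}
        self._t_mid = None

    def connect(self, host: str, port: int) -> None:
        self._connected = True  # a real interface would open a connection here

    def set_problem(self, partial_problem: Dict[str, Any]) -> None:
        # Translate the agent's description into the form the solver expects.
        self._problem = {"left": partial_problem["t_left"],
                         "right": partial_problem["t_right"]}

    def run(self) -> None:
        if not self._connected:
            raise RuntimeError("solver not reachable")
        self._t_mid = 0.5 * (self._problem["left"] + self._problem["right"])

    def status(self) -> Dict[str, Any]:
        return {"connected": self._connected, "finished": self._t_mid is not None}

    def result(self) -> Dict[str, Any]:
        return {"t_mid": self._t_mid}

    def release(self) -> None:
        self._connected = False  # detach from the solver and free its resources


solver = MidpointSolver()
solver.connect("localhost", 5000)  # hypothetical host and port
solver.set_problem({"t_left": 20.0, "t_right": 60.0})
solver.run()
midpoint = solver.result()
solver.release()
```

A calculation agent holding several such objects, one per solver, would correspond to the multiclass case mentioned above.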

E. Processing Details 

The way a problem is solved depends significantly on the timing of the
agents finishing their calculations and informing other agents about
their results. In coupled systems not every effect has the same
significance at each moment of the solving process. For an identical
mesh, the time to calculate a partial result for two physics depends on
the linearity of the materials for the different physics as well as on
the resources each agent is able to use. The dependent partial results
therefore become available at different times, and this timing issue
needs special care in the global system.

As an example, the temperature at a circuit board after a certain time is
to be calculated, as shown in Fig. 7. The system consists of one
coordination agent and two calculation agents. The first calculation
agent calculates the electric field and, as a side effect, the resistive
losses. The second agent takes care of the calculation of the heat
conduction from the transistor. In the described approach the
recalculation of the overall temperature simulation is automatically
initialised if the result of the electric simulation including the
resistive losses is present and significantly changes the result of the
temperature simulation. Because of the parallel calculation of the
circuit board it is only necessary to recalculate some values of the
matrix. Fig. 8 and Fig. 9 show two different scenarios for handling the
different calculation times.

Fig. 7: Coupled Problem 

In Fig. 8 a calculation procedure is assumed where the calculation agent
responsible for solving the electric field problem has finished its
calculation first. In that case the calculation agent responsible for the
temperature has to check whether the heat generated by the electric
current significantly changes its own result or can be ignored. In the
example the heat has to be considered. Therefore the temperature
calculation agent has to update its calculation based on the result of
the electric calculation agent. To do so, it adapts its boundary
conditions to the result and recalculates. If the temperature calculation
agent also finishes and no more calculation agents are working on the
problem, the coordination agent requests all calculated partial results
and combines them into a single result that is finally offered to the
user.

Fig. 8: Solution case I 

In Fig. 9 the opposite case is considered. The calculation agent
responsible for solving the temperature problem finishes first. In this
simplified case the temperature does not have any influence on the
electric conductivity, so the electric calculation agent responds with a
"not acknowledge" (NACK) to the announcement of a result. This NACK means
that no influence is expected from the partial result of the temperature
calculation agent on the electric calculation agent. The electric
calculation agent then continues and finishes its calculation regularly.
When the temperature calculation agent obtains the result of the electric
calculation agent, it checks its result and recalculates it as in the
previous case. The remaining steps proceed as before.

Fig. 9: Solution case II
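
The decision behind the two cases of Fig. 8 and Fig. 9 is whether a reported result influences the receiving agent's own partial problem. A minimal sketch of such a check is given below. It rests on assumptions that are not in the paper: the class `PartialProblem`, the relative tolerance `rel_tol`, the use of "ACK" as the counterpart of NACK and the numerical values are hypothetical, and answering NACK for a change below the tolerance is one possible reading of "can be ignored".

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class PartialProblem:
    name: str
    inputs: Dict[str, float] = field(default_factory=dict)  # coupled quantities used
    rel_tol: float = 0.01          # hypothetical threshold for a significant change

    def on_result(self, quantity: str, value: float) -> str:
        if quantity not in self.inputs:
            return "NACK"          # the result has no influence on this problem
        old = self.inputs[quantity]
        scale = max(abs(old), abs(value), 1e-12)
        if abs(value - old) / scale <= self.rel_tol:
            return "NACK"          # change too small to justify a new run
        self.inputs[quantity] = value  # adopt the value as new boundary condition
        return "ACK"               # the caller re-initialises its solver


# Simplified example: the conductivity does not depend on the temperature,
# so the electric problem lists no coupled input quantity.
electric = PartialProblem("electric")
thermal = PartialProblem("thermal", {"resistive_loss": 0.0})

case_ii = electric.on_result("temperature", 41.7)     # temperature finished first
case_i = thermal.on_result("resistive_loss", 10.9)    # electric finished first
repeat = thermal.on_result("resistive_loss", 10.95)   # only a slight change
```

With these numbers the electric problem ignores the temperature result, while the thermal problem accepts the first loss value and skips the slightly modified one.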

III. IMPLEMENTATION ENVIRONMENT 

A. The Agent System 

The actual implementation uses the Java Agent Development Framework
(JADE) as middleware to implement the mentioned agents. JADE is a
Java-based agent development framework. It is distributed by Telecom
Italia and currently available under the Lesser General Public License
Version 2 (LGPLv2); the latest version, 4.2.0, was released on 26th June
2012. The implementation in Java grants an environment that is
independent of system and operating system. The minimal system
requirement is a running Java 1.4 runtime environment, which is available
for almost every system and even smartphones. The agent communication is
based on a protocol containing seven layers that ensure the correct
transmission and reception of messages of different types. The complete
system is thereby based on a standard offered by the "Foundation for
Intelligent Physical Agents" (FIPA), which was taken over by IEEE in
2005.

B. The Solver 

Each calculation agent can connect itself to two different solvers. As
FEM software, COMSOL Multiphysics [15] offers a range of modern
algorithms able to solve coupled problems. It also includes the creation
of a mesh with multiple mesh types. In this context it is quite useful
that all elements available in the software can be reached via the Java
API that COMSOL Multiphysics offers. It is also possible to run the
software as a server and connect to the server via a provided jar library
for Java programs. The library also passes results that can be visualised
within a GUI. The prototype of the GUI for the calculation agent is based
on the example for the usage of the COMSOL API. Within the API the GUI
and the solver are already realised in parallel threads and notifications
are sent when a task starts or finishes. This helps initialising further
events. Necessary functions to control the solver and its behaviour are
also implemented. As BEM software, the calculation agents are prepared to
connect to FAMU [8].


This approach will efficiently solve highly complex three-dimensional
coupled field problems based on the idea of software agents spread onto
multiple distributed computers including the cloud. The distributed
computers run so-called calculation agents, which are able to solve
smaller problems. The total expenditure for finding a solution is reduced
by the creation of multiple smaller equation systems. The complex coupled
problem is split into these smaller problems by a variation of already
established domain decomposition methods. The domain decomposition used
is based on different physics and different material properties as well
as geometrical aspects. Every partial problem is then assigned to a
calculation agent and handled on its own. Finding a solution of a coupled
problem only becomes possible through the communication and negotiation
between the different agents. The communication also allows the system to
adapt dynamically to new surroundings and to find convergence in coupled
systems.


An additional reduction of the effort for finding a solution is achieved
by the independent decisions the agents are allowed to take. This
especially concerns the way the calculation agents solve the given
partial problem. It includes the choice of a numerical method such as FEM
or BEM and the reaction to changed boundary conditions: slightly modified
boundary conditions are skipped for the partial result if no change in
the result is expected, instead of initialising a new calculation cycle.
The domain decomposition and the coordination of the interworking of
different calculation agents are handled by a so-called coordination
agent. In this paper the realisation of a calculation agent and of a
coordination agent with their different functions and their interworking
is described.

REFERENCES 

[1] N. Jennings, "Agent-Oriented Software Engineering," in Multi-Agent
System Engineering, Berlin, Springer, 1999, pp. 1-7.

[2] H. Mubarak and P. Göhner, "An agent-oriented approach for
self-management of industrial automation systems," 8th International
Conference on Industrial Informatics, pp. 721-726, 2010.

[3] M. Pipattanasomporn, H. Feroze and S. Rahman, "Multi-agent 

systems in a distributed smart grid: Design and implementation," 

Power Systems Conference and Exposition, pp. 1-8, 2009. 

[4] C. Malz, N. Jazdi and P. Göhner, "Prioritization of Test Cases 

Using Software Agents and Fuzzy Logic," 5th Conference on 

Software Testing, Verification and Validation, pp. 483-486, 2012. 

[5] D. G. Lymperopoulos, N. L. Tsitsas and D. I. Kaklamani, "A 

Distributed Intelligent Agent Platform for Genetic Optimization in 

CEM: Applications in a Quasi-Point Matching Method," 

Transactions on Antennas and Propagation, vol. 55, no. 3, pp. 

619-628, 2007. 

[6] A. Buchau, S. M. Tsafak, W. Hafla and W. M. Rucker, 

"Parallelization of a Fast Multipole Boundary Element Method 

with Cluster OpenMP," Transactions on Magnetics, vol. 44, no. 6, 

pp. 1338-1341, 2008. 

[7] A. Buchau, W. M. Rucker, O. Rain, V. Rischmuller, S. Kurz and
S. Rjasanow, "Comparison between different approaches for fast and
efficient 3-D BEM computations," Transactions on Magnetics, vol. 39,
no. 3, pp. 1107-1110, 2003.

[8] A. Buchau, W. Hafla, F. Groh and W. M. Rucker, "Fast multipole 

method based solution of electrostatic and magnetostatic field 

problems," Computing and Visualization in Science, vol. 8, no. 3, 

pp. 137-144, 2005. 

[9] V. Rischmuller, S. Kurz and W. M. Rucker, "Parallelization of 

coupled differential and integral methods using domain 

decomposition," Transactions on Magnetics, vol. 38, no. 2, pp. 

981-984, 2002. 

[10] P. Alotto, M. Guarnieri and F. Moro, "A Fully Coupled
Three-Dimensional Dynamic Model of Polymeric Membranes for Fuel Cells,"
Transactions on Magnetics, vol. 46, no. 8, pp. 3257-3260, 2010.

[11] J. Albert, R. Banucu, W. Hafla and W. M. Rucker, "Simulation 

Based Development of a Valve Actuator for Alternative Drives 

Using BEM-FEM Code," Transactions on Magnetics, vol. 45, no. 

3, pp. 1744-1777, 2009. 

[12] C. Farhat and F.-X. Roux, "A method of finite element tearing and 

interconnecting and its parallel solution algorithm," International 

Journal for Numerical Methods in Engineering, vol. 32, no. 6, pp. 

1205-1227, 1991. 

[13] U. Langer and O. Steinbach, "Boundary Element Tearing and 

Interconnecting Methods," Computing, vol. 71, no. 3, pp. 205-228, 

2003. 

[14] D. Lavers, I. Boglaev and V. Sirotkin, "Numerical solution of
transient 2-D eddy current problem by domain decomposition algorithms,"
Transactions on Magnetics, vol. 32, no. 3, pp. 1413-1416, 1996.

[15] COMSOL AB, Tegnérgatan 23, SE-111 40 Stockholm.








Human exposure to the magnetic field 

produced by MFDC spot welding systems 

D. Bavastro ∗ , A. Canova ∗ , L. Giaccone ∗ , M. Manca ∗ , M. Simioli † 

∗ Politecnico di Torino - Dipartimento Energia, C.so Duca degli abruzzi 24, 10129 Torino, Italy 

† KGR S.p.a, via Nicolao Cena 65, 10032 Brandizzo, Italy 

E-mail: luca.giaccone@polito.it 

Abstract—In this paper the magnetic field emission of Medium Frequency Direct Current (MFDC) spot
welding systems is analyzed with reference to the exposure of the working population. In the first part of the
paper, experimental measurements are carried out to obtain the magnetic field emission of
a selected MFDC system. The measurement is performed in the time domain, acquiring the waveform of the
magnetic field. In the second part of the paper, the methodologies suggested by the ICNIRP guidelines are
adopted for the analysis of the waveforms: the equivalent frequency method, the multiple frequency method
and the weighted multiple frequency method. An exhaustive comparison of the methodologies suggested
by the guidelines is given and contextualized in the regulatory framework. The emission of the MFDC spot
welding system has been completely characterized. By means of spectral analysis it is found that the
exceedance of the limit is mainly due to the 2000 Hz and 4000 Hz components. This result is useful for
manufacturers because, in order to minimize the overall emission, a mitigation system that encloses only
the related internal components can be considered.

Index Terms—pulsed magnetic fields, quasi-static magnetic fields, spot welding, MFDC 


Protection of the working population against the possible effects of
extremely low frequency (ELF) electromagnetic fields is a concern of the
European Community, which published Directive 2004/40/EC in 2004 [1]. The
Directive refers to the risk to the health and safety of workers due to
known short-term adverse effects in the human body caused by the
circulation of induced currents, by energy absorption, and by contact
currents. One of the most important points stated in the Directive is the
rationale of exposure at low frequency, which is defined in accordance
with the ICNIRP 1998 guidelines [2]. These guidelines report that in the
ELF frequency range, the risk to the health and safety of workers is due
to known short-term adverse effects caused by the circulation of induced
currents in the human body.

The ICNIRP guidelines and Directive 2004/40/EC provide a definition of
reference or action values (i.e., values which can be directly measured,
like magnetic flux density) and exposure limit values (i.e., limits which
are based directly on established health effects and biological
considerations, like current density).

In this paper the exposure to the magnetic field 

produced by Medium Frequency Direct Current 

(MFCD) spot welding devices is analyzed. For this 

application the magnetic field waveform is pulsed 

and non-sinusoidal. While continuous wave mode 

of exposure is strictly defined in ICNIRP guidelines, 

the evaluation for pulsed or non-sinusoidal 

magnetic field waveforms is still an open question 

[3], [4], [5], [6], [7], [8], [9]. In the ICNIRP guidelines 

(year 1998), the problem of non-sinusoidal 

waveforms was tackled by means of superposition 

of harmonic values. This approach, even if possible, 

has been highly criticized afterwards because 

of excessively conservative estimates of exposure

levels. Due to the increasing importance of non-sinusoidal

sources of magnetic fields, in 2003 

ICNIRP has published a new guideline for pulsed 

and complex non-sinusoidal waveforms [10]. This 

document is strongly based on the results obtained

by K. Jokela [11]. It addresses the exposure evaluation

in non-sinusoidal conditions by means of 

proper weighting factors to be applied to different 

harmonic components of the waveform spectrum. 

In 2010 the ICNIRP published a new set of 

guidelines [13]. There are two main differences 

between this document and the older one: 1) the 

dosimetric quantity used for ELF electromagnetic 

field is E (V/m) instead of J (A/m²). 2) The limits

imposed on action values have been increased 

as can be observed in Fig. 1. 

From the analysis of the current literature, several 

papers analyze the same problem by computing 

the spectrum of the measured welding current.

Fig. 1. Reference levels for occupational exposure to time 

varying magnetic field. Comparison between 1998 and 2010 

values. 

Afterward, the welder is usually modeled as a 

coil, and simulations are performed in the frequency

domain for each spectral component of the current 

in order to derive the current density inside a 

human model or a simplified and standardized 

model [8], [9], [12]. Even if this kind of simulation

is not an easy task, the procedure is often

preferred because it is easier to measure the current 

in time domain rather than the magnetic field, 

especially for quasi-rectangular waveforms [14].

The drawback of this procedure is that, once a spectrum is assumed for the current, the generated magnetic field has the same spectrum at all the surrounding points, because the nonlinear electrical devices inside the body of the welder are neglected.

In this paper the different methodologies provided 

by the ICNIRP to analyze pulsed and non-sinusoidal magnetic fields are applied to MFDC spot welding devices. The actual waveform of the magnetic field has been measured, taking care of the possible measurement problems [14]. Finally, in order to compute the exposure level, the limits provided by ICNIRP 1998 have been employed, because the Italian regulatory framework currently refers to those guidelines.

II. MFDC SYSTEMS 

In Fig. 2 the conversion chain of the MFDC 

system is represented. The supply power, taken 

from the three-phase system at 50 Hz, is driven 

by means of a rectifier to an IGBT switch. The 

waveform at the input/output of the transformer is 

characterized by a frequency of 1000 Hz and, after 

a full-wave rectification (f = 2000 Hz), is applied 

to the welder terminals. The welder terminal can 

be considered as a R-L load. The switching of 

the IGBT bridge is controlled so that the welding 

current reaches a desired (constant) value. The 


welding process can be performed by means of 

a single impulse or with more than one impulse.

In Fig. 3 the waveform of the weld current is 

shown. As it can be observed, the actual current 

is not perfectly rectangular because the conversion 

system is not able to nullify completely a ripple at 

2000 Hz (and higher harmonics) that is superposed 

to the weld current. The main weld parameters are:

the current peak (Ip) that is usually in the order 

of some kA, the weld time that is the duration of 

the single pulse and the hold time that is a period 

when the current is zero after the welding, but the 

electrodes are still applied to the sheet to chill the 

weld. 
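The frequency doubling produced by the full-wave rectifier can be checked numerically. The following is a minimal sketch, assuming an idealized sinusoidal 1000 Hz transformer output and an ideal rectifier (the real inverter waveform is not sinusoidal); the sampling rate and the variable names are illustrative assumptions, not taken from the measurement setup.

```python
import numpy as np

fs = 100_000          # sampling rate in Hz (assumed for illustration)
f_inv = 1000.0        # transformer frequency quoted in the text, Hz
n = 10_000            # 0.1 s of signal, so each harmonic falls on an FFT bin
t = np.arange(n) / fs

v_transformer = np.sin(2 * np.pi * f_inv * t)   # idealized transformer output
v_rectified = np.abs(v_transformer)             # ideal full-wave rectification

# Remove the DC level and look for the strongest remaining spectral line.
spectrum = np.abs(np.fft.rfft(v_rectified - v_rectified.mean()))
freqs = np.fft.rfftfreq(n, d=1.0 / fs)
f_ripple = freqs[np.argmax(spectrum)]
print(f"dominant ripple frequency: {f_ripple:.0f} Hz")
```

Under these assumptions the strongest ripple line sits at twice the transformer frequency, consistent with the 2000 Hz ripple described above.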

 

 

 

 

 

 

 

 

 

Fig. 2. MFDC spot welding device under analysis. The 

reference system for the measurement is placed in the center 

of the welding coil. 


Fig. 3. Weld current and its main parameters. Weld time = 

140 ms, hold time = 40 ms. 

In the spot welding sector MFDC welders are often preferred to AC ones because of some benefits: a shorter weld time is required thanks to the DC output current, hence a significant energy saving is obtained. Moreover, MFDC systems are very stable in working conditions far from the rated power (useful range: 20-95%). Conversely, AC systems are unstable and inefficient when used outside 70-90% of the rated power.

III. EXPERIMENTAL MEASUREMENT 

In this paper the magnetic field emission of 

an MFDC welder produced by KGR S.p.A. is

analyzed. In Fig. 4 the layout with dimensions

is shown. The MWG model is a manual welder, 

therefore the operator is quite close to the device 

during the welding operation. In Fig. 5 a classical 

working configuration is reported in frontal and 

lateral view. 

Several observation points have been defined 

in order to evaluate the human exposure in the 

working configuration represented in Fig. 5. It 

has to be stressed that the considered working 

configuration is also the most critical one because 

the operator faces the welder coil.

The measurement points are summarized in Table

I. The coordinates are related to the reference 

system in Fig. 6. 

Fig. 4. Weld current and its main parameters. Weld time = 

140 ms, hold time = 40 ms. 

Fig. 5. Working configuration: front view (a) and side view (b).

TABLE I
FIELD POINTS

field point   x (m)   y (m)   z (m)
P1            0       0       0.28
P2            0       0       0.5
P3            0.5     0       0.28
P4            0.5     0       0.5
P5            0.8     0       0.28
P6            0.8     0       0.5

All the measurements refer to the current represented in Fig. 3: current peak (Ip) equal to 12 kA, weld time equal to 140 ms and hold time equal to 40 ms.

Fig. 6. Reference system for measurement points definition

For each measurement point the waveform of the magnetic field has been measured along the three axes (x, y and z). Finally the rms values have been computed. The total field waveforms for each observation point are shown in Fig. 7. By comparing those waveforms it is possible to derive some considerations: 1) as expected, the maximum value is observed for the field points P1 and P2, which face the welder coil. 2) P1, P3 and P5 present higher peak values than P2, P4 and P6 because they are closer to the welder. 3) by defining two groups of points with the same distance from the welder, [P1, P3, P5] and [P2, P4, P6], it is possible to note that the peak value decreases when moving away from the welder coil. 4) the decreasing law for the peak value does not hold for the medium frequency ripple superposed on the waveforms. It seems that the observation points P3 and P4 are characterized by the highest ripple. This consideration is investigated further in the following by analyzing the spectrum components of all the waveforms. At this stage, a qualitative explanation can be given considering that P3 and P4 are located in front of the full wave rectifier connected to the secondary winding of the medium frequency transformer. Hence, P3 and P4 are the points most influenced by the conversion system.

IV. HUMAN EXPOSURE EVALUATION 

The magnetic field produced by MFDC spot 

welding system is a pulsed magnetic field (see 

Fig. 7). With reference to the ICNIRP guidelines 

[2], [10], [11], it can be analyzed with three different 

approaches: the equivalent frequency method, 

the multiple frequency method and the weighted 

multiple frequency method. 

A. Equivalent frequency method 

The equivalent frequency method is introduced 

in note 4 of Tables 4 and 6 of the ICNIRP

guidelines [2]. Afterward, it is described in more detail in

the guidelines focused on pulsed magnetic fields 

[10]. The method simply takes into account the 

pulse duration (tp) and maximum value of the 

waveform during the pulse. Finally it refers to the

equivalent and continuous sinusoidal field with a 

frequency feq = 1/(2tp) in order to test compliance

with the reference levels.

Fig. 7. Total field waveforms: (a) P1 and P2, (b) P3 and P4, (c) P5 and P6

Fig. 8. Equivalent frequency method: application to the 

measurement point P1 

If a single pulse of the magnetic field measured 

in the observation point P1 is analyzed by means 

of the equivalent frequency method the result of 

Fig. 8 is obtained. The single pulse is characterized 


by tp = 140 ms, which leads to a continuous sinusoidal field with frequency equal to feq = 3.5 Hz. The amplitude of the sinusoidal field is set to the maximum value observed during the single pulse, i.e. 6298.3 μT.

It is now simple to identify the reference level 

for a 3.5 Hz sinusoidal field from Fig. 1, that is 

16327 μT. It must be stressed that the reference level is an rms value, therefore it must be compared with 6298.3/√2 = 4453.5 μT. Finally, by means of the equivalent frequency method, the welder produces a magnetic field that is 3.5 times lower than the applicable reference level in the observation point with the highest emission.
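The arithmetic of this check can be reproduced with a short script. This is a minimal sketch, assuming the ICNIRP 1998 occupational reference levels for magnetic flux density between 1 Hz and 65 kHz and the convention feq = 1/(2 tp), which gives about 3.57 Hz and is consistent with the 3.5 Hz used in the text; the function and variable names are illustrative.

```python
import math

def b_ref_occupational_1998_uT(f_hz):
    """rms reference level in microtesla (assumed ICNIRP 1998, occupational)."""
    if 1.0 <= f_hz < 8.0:
        return 2.0e5 / f_hz**2
    if 8.0 <= f_hz < 820.0:
        return 2.5e4 / f_hz
    if 820.0 <= f_hz <= 65.0e3:
        return 30.7
    raise ValueError("frequency outside the 1 Hz to 65 kHz range covered here")

t_p = 0.140                        # pulse duration, s
b_peak_uT = 6298.3                 # maximum field measured at P1, microtesla

f_eq_exact = 1.0 / (2.0 * t_p)     # about 3.57 Hz
f_eq = 3.5                         # rounded value used in the text

b_limit_uT = b_ref_occupational_1998_uT(f_eq)   # rms reference level
b_rms_uT = b_peak_uT / math.sqrt(2.0)           # rms of the equivalent sinusoid
margin = b_limit_uT / b_rms_uT

print(f"f_eq = {f_eq_exact:.2f} Hz, limit = {b_limit_uT:.0f} uT, "
      f"field = {b_rms_uT:.1f} uT rms, margin = {margin:.2f}")
```

With these inputs the reference level evaluates to about 16327 μT and the margin to roughly 3.7, of the same order as the factor quoted above.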

B. Multiple frequency method 

The multiple frequency method is suggested for 

non-coherent waveforms, i.e. waveforms that cannot be measured repeatably because of their intrinsic variation in time.

The procedure is summarized in the following 

steps: 

• selection of the signal to be analyzed 

• perform the Fourier Transform of the signal 

• computing the global indicator defined as: 

$$I_s = \sum_{j = 1\,\mathrm{Hz}}^{65\,\mathrm{kHz}} \frac{B_j}{B_{L,j}} + \sum_{j > 65\,\mathrm{kHz}}^{10\,\mathrm{MHz}} \frac{B_j}{b} \qquad (1)$$

where:

• Bj is the magnetic flux density at frequency 

j; 

• BL,j is the magnetic flux density reference 

level; 

• b is 30.7 μT (rms) for occupational exposure 

The exposure is compliant with the limit if Is < 1. 
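The steps above can be sketched numerically as follows. This is a minimal illustration, not the processing chain used for the measurements: it assumes the ICNIRP 1998 occupational reference levels between 1 Hz and 65 kHz, takes each FFT line of the record as one spectral component Bj (rms), and ignores components below 1 Hz. The function names and the test signal are hypothetical.

```python
import numpy as np

def b_ref_rms_uT(f_hz):
    """Assumed ICNIRP 1998 occupational rms reference level, 1 Hz to 65 kHz."""
    f = np.asarray(f_hz, dtype=float)
    return np.select([f < 8.0, f < 820.0],
                     [2.0e5 / f**2, 2.5e4 / f],
                     default=30.7)

def multiple_frequency_index(b_uT, fs, b_const_uT=30.7):
    """Evaluate Is of eq. (1) for a sampled waveform b_uT (microtesla)."""
    n = len(b_uT)
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    # rms value of each spectral line: (2 |X| / n) / sqrt(2)
    b_rms = np.sqrt(2.0) * np.abs(np.fft.rfft(b_uT)) / n
    low = (freqs >= 1.0) & (freqs <= 65.0e3)
    high = (freqs > 65.0e3) & (freqs <= 10.0e6)
    return (np.sum(b_rms[low] / b_ref_rms_uT(freqs[low]))
            + np.sum(b_rms[high]) / b_const_uT)

# Hypothetical test signal: a 2000 Hz sinusoid whose rms equals the 30.7 uT limit.
fs = 20_000
t = np.arange(fs) / fs
b_test = 30.7 * np.sqrt(2.0) * np.cos(2 * np.pi * 2000.0 * t)
i_s = multiple_frequency_index(b_test, fs)
print(f"Is = {i_s:.3f}")
```

A single component sitting exactly on its reference level yields Is close to 1, which is the compliance threshold; for a pulsed record the result also depends on the record length chosen for the transform.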

In order to test the compliance with this method 

a single impulse in each observation point has 

been selected in order to perform the Fourier 

Transform. Finally, in order to better point out the 

rationale of this methodology, the Is representation 

is provided: 

• graphically: the x-axis represents the frequency and the y-axis the ratio Bj/BL,j. This makes it possible to see which frequency components exceed the relative limit.

• numerically: computation of Is with (1) 

For the sake of brevity the graphical results are shown just for the measurement points P1 and P3. In Fig. 9 it is possible to see the spectrum of the P1 and P3 waveforms. The harmonic content is mainly located in the range 0-100 Hz, with a significant DC component. The spectrum is also characterized by a 2000 Hz component and its multiples. As explained before, those components are generated by the final full wave rectifier. In fact, the observation point P3, which faces the rectifier, presents in its spectrum the highest harmonic content for this set of frequencies.

Fig. 9. Spectral analysis of the measured waveform: (a) P1: x = 0 m, y = 0 m, z = 0.28 m; (b) P3: x = 0.5 m, y = 0 m, z = 0.28 m

Fig. 10. Computation of the ICNIRP limit by means of the multiple frequency method: (a) P1: x = 0 m, y = 0 m, z = 0.28 m; (b) P3: x = 0.5 m, y = 0 m, z = 0.28 m

From the analysis of Fig. 10 it turns out that none of the observation points can be considered compliant if the multiple frequency method is adopted, because the Is values significantly exceed the unitary reference level (see also Table II for the other measurement points). Remembering that the y-axis of Fig. 10 represents the summation terms of (1), it is possible to observe that the limit is exceeded mainly because of the 2000 Hz and 4000 Hz components. For point P3 those harmonics are about 5 times higher than the relative reference level. It is worth noting that reference levels become stricter for higher frequencies. Moreover, above 820 Hz eq. (1) imposes a constant limit equal to 30.7 μT even if in Fig. 1 the trend is still decreasing above 65 kHz.

Finally, it must be stressed that this procedure 

is based on the assumption that the spectral components 

add in phase, i.e., all maxima coincide at 

the same time, resulting in a sharp peak. This is

a realistic assumption when the number of spectral 

components is limited and their phases are not 

coherent, i.e., they vary randomly. For fixed coherent 

phases the assumption may be unnecessarily 

conservative [2], [11], [13]. 

C. Weighted multiple frequency method 

The weighted multiple frequency method is proposed for coherent waveforms, i.e. waveforms that can be measured repeatably. For these signals the multiple frequency method always leads to conservative exposure assessments.

This modification provides an alternative method 

based on weighted peak that more closely recognizes 

the nature of biological interactions. 

The procedure is summarized in the following 

steps: 

• selection of the signal to be analyzed 

• perform the Fourier Transform of the signal 

• computing the global indicator defined as: 

 

 

 

 

 

 

$$I_w = \max_t \left| \sum_j (WF)_j \, B_j \cos\left(2\pi f_j t + \theta_j + \phi_j\right) \right| \qquad (2)$$

where: 

• Bj is the amplitude of the j-th frequency 

component;

• WFj is the weighting function, whose magnitude is equal to the inverse of the peak reference level at the j-th frequency

• θj is the phase of the j-th component of B

• ϕj is the phase of the weighting function for the j-th component. It should satisfy the conditions:

– ϕ(f) = π/2 if f < fc

– ϕ(f) = 0 if f > fc

• fc = 820 Hz for occupational exposure 

The exposure is compliant with the limit if Iw < 1. 
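A weighted-peak evaluation along these lines can be sketched in the frequency domain. This is a minimal illustration under several assumptions that are not stated in the text: the peak reference level is taken as √2 times the ICNIRP 1998 occupational rms reference level, the phase of the weighting function switches abruptly at fc, and components below 1 Hz or above 65 kHz are discarded. Function names and test signals are hypothetical.

```python
import numpy as np

F_C = 820.0  # cut-off frequency for occupational exposure, Hz

def b_ref_rms_uT(f_hz):
    """Assumed ICNIRP 1998 occupational rms reference level, 1 Hz to 65 kHz."""
    f = np.asarray(f_hz, dtype=float)
    return np.select([f < 8.0, f < 820.0],
                     [2.0e5 / f**2, 2.5e4 / f],
                     default=30.7)

def weighted_peak_index(b_uT, fs):
    """Evaluate Iw of eq. (2) for a sampled waveform b_uT (microtesla)."""
    n = len(b_uT)
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    spectrum = np.fft.rfft(b_uT)
    weights = np.zeros(len(freqs), dtype=complex)
    band = (freqs >= 1.0) & (freqs <= 65.0e3)
    # Magnitude: inverse of the peak reference level (assumed sqrt(2) * rms level).
    magnitude = 1.0 / (np.sqrt(2.0) * b_ref_rms_uT(freqs[band]))
    # Phase: pi/2 below the cut-off frequency, 0 above it.
    phase = np.where(freqs[band] < F_C, np.pi / 2.0, 0.0)
    weights[band] = magnitude * np.exp(1j * phase)
    weighted = np.fft.irfft(spectrum * weights, n=n)
    return float(np.max(np.abs(weighted)))

fs = 20_000
t = np.arange(fs) / fs
# Hypothetical test signals, each with a peak value equal to its peak reference level.
i_w_2000 = weighted_peak_index(30.7 * np.sqrt(2.0) * np.cos(2 * np.pi * 2000.0 * t), fs)
i_w_50 = weighted_peak_index(500.0 * np.sqrt(2.0) * np.cos(2 * np.pi * 50.0 * t), fs)
print(f"Iw(2000 Hz) = {i_w_2000:.3f}, Iw(50 Hz) = {i_w_50:.3f}")
```

Because the weighted components are summed with their phases before the maximum is taken, coherent waveforms are not penalized by the in-phase assumption of the multiple frequency method.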

Unlike in the multiple frequency method,

here the spectrum of the waveform is weighted by 

a complex function. It is possible to observe that 

the ICNIRP definition of the weighting function 

for magnetic flux density waveform is a high-pass 

filter with cut-off frequency equal to fc = 820 Hz 

[10], [11]. 

In order to better point out the rationale of this 

methodology, the Iw representation is provided: 

• graphically: the graph of the following function of time is provided: $\sum_j (WF)_j \, B_j \cos(2\pi f_j t + \theta_j + \phi_j)$

• numerically: computation of Iw with (2) 

From the analysis of the results presented in 

Fig. 11 it is possible to observe that even applying 

the weighting methods none of the observation 

points can be considered compliant with 

the unitary reference level (see also Table II for 

the other measurement points). In spite of this 

result, the weighting procedure allows a significant 

reduction for some observation points (see P1 

and P3 values). This is in accordance with the fact that, for pulsed fields associated with spot welding machines, the multiple frequency method is too conservative an assessment procedure.


In this paper the magnetic field generated by MFDC spot welding systems has been analyzed. All the methodologies suggested in the international guidelines provided by the ICNIRP

have been employed: the equivalent frequency 

method, the multiple frequency method and the 

weighted multiple frequency method. 

The main conclusion is that the three methodologies 

lead to contrasting results. By means of the

equivalent frequency method the emission is compliant

with the reference level even in the maximum 

emission point (3.5 times lower than the 

limit). On the other hand, the other two approaches 

provide opposite results, i.e., none of the surrounding 

points is compliant with the limit. 

In Table II the results for the weighted and non-weighted multiple frequency methods are summarized. For most of the observation points, a large difference in the results is observed. In our opinion, for the spot welding process the multiple frequency method is too conservative because the waveform is clearly coherent due to the supply control system. However, the results for the weighted multiple frequency method are still quite far from the unity limit.

Fig. 11. Computation of the ICNIRP limit by means of the weighted multiple frequency method: (a) P1: x = 0 m, y = 0 m, z = 0.28 m; (b) P3: x = 0.5 m, y = 0 m, z = 0.28 m

The definition of exposure limits for pulsed magnetic fields is still an open problem. Therefore no standard or guideline defines precisely the constraints and the procedure to be observed for spot welding machines, or for other technologies characterized by similar emission of pulsed magnetic fields (e.g. MRI devices). One of the consequences is that Directive 2004/40/EC has been modified, extending the deadline for the application of the reference levels [15], [16], which is now fixed at October 31st, 2013. Concerning the EU directives, it must be stressed that reference levels are always related to acute effects and not to possible long term effects. In the literature, as well as in real life applications, there is no evidence of acute effects. For possible long term effects, epidemiological studies recognize that it is hard to estimate the exposure, because other types of exposure may arise in the same workplace which are correlated with the EMF exposure and which may affect the health of workers. An example concerns exposure to welding fumes, which may increase lung cancer risks among welders [17], [18].

TABLE II
COMPARISON BETWEEN WEIGHTED AND NON-WEIGHTED MULTIPLE FREQUENCY METHOD

field point   Is       Iw
P1            99.68    16.77
P2            13.60    4.23
P3            125.28   36.41
P4            18.23    7.51
P5            23.64    7.80
P6            6.38     4.61

From the technical point of view it is practically impossible to apply a mitigation system to the welder coil for (obvious) operating reasons. The analysis of Table II together with Fig. 9 shows that the limit is exceeded mainly because of the 2000 Hz and 4000 Hz components. For point P3 those harmonics are about 5 times higher than the relative reference level. Therefore, the future development of this work is to design a shielding case for the transformer and the full wave rectifier. Obviously the mitigation system will not reduce the maximum value of the magnetic field (which is generated by the welder coil). The aim is just to reduce as much as possible the medium frequency components that seem to play a significant role in equations (1) and (2).

REFERENCES

[1] Directive of the European Parliament and of the Council of 29 April 2004 on the minimum health and safety requirements regarding the exposure of workers to the risks arising from physical agents (electromagnetic fields). European Parliament and Council.

[2] ICNIRP. Guidelines for limiting exposure to time varying electric, magnetic and electromagnetic fields (up to 300 GHz). Health Phys, 74(4):494–522, 1998.

[3] H. Heinrich. Assessment of non-sinusoidal, pulsed or intermittent exposure to low frequency electric and magnetic fields. Health Phys, 96(6):541–546, 2007.

[4] R. Scorretti, N. Burais, A. Fabregue, O. Nicolas, and L. Nicolas. Computation of the induced current density into the human body due to relative LF magnetic field generated by realistic devices. IEEE Transactions on Magnetics, 40(2):643–646, 2004.

[5] D. Desideri and A. Maschio. Magnetic field emissions up to 400 kHz from a welding equipment. In Proc. Int. Symp. Electromagnetic Compatibility, pages 151–156, Barcelona, 2006.

[6] D. Desideri, A. Maschio, and P. Mattavelli. Human exposure to pulsed current waveforms below 100 kHz. In Proc. Int. Symp. Electromagnetic Compatibility, pages 391–396, Hamburg, Sep. 8–12, 2008.

[7] G. Kang and O. Gandhi. Comparison of various safety guidelines for electronic article surveillance devices with pulsed magnetic fields. IEEE Transactions on Biomedical Engineering, 50(1), 2003.

[8] A. Canova, F. Freschi, and M. Repetto. Evaluation of workers exposure to magnetic fields. The European Physical Journal Applied Physics, 52(2), 2010.

[9] A. Canova, F. Freschi, L. Giaccone, and M. Repetto. Exposure of working population to pulsed magnetic fields. IEEE Transactions on Magnetics, 46(8):2819–2822, 2010.

[10] ICNIRP. Guidance on determining compliance of exposure to pulsed and complex non-sinusoidal waveform below 100 kHz with ICNIRP guidelines. Health Phys, 84(3):383–387, 2003.

[11] K. Jokela. Restricting exposure to pulsed and broadband magnetic fields. Health Phys, 79(4):373–388, 2000.

[12] F. Dughiero, M. Forzan, and E. Sieni. A numerical evaluation on electromagnetic fields exposure on real human body models until 100 kHz. COMPEL, 29:1552–1561, 2010.

[13] ICNIRP. Guidelines for limiting exposure to time-varying electric and magnetic fields (1 Hz to 100 kHz). Health Phys, 99(6):818–836, 2010.

[14] G. Crotti and D. Giordano. Problems in the detection of quasi-rectangular magnetic flux density waveforms. In 18th Symposium IMEKO TC4, Natal (Brasil), September 2001.

[15] Directive of the European Parliament and of the Council of 23 April 2008 amending Directive 2004/40/EC on minimum health and safety requirements regarding the exposure of workers to the risks arising from physical agents (electromagnetic fields) (18th individual Directive within the meaning of Article 16(1) of Directive 89/391/EEC).

[16] Directive of the European Parliament and of the Council of 19 April 2012 amending Directive 2004/40/EC on minimum health and safety requirements regarding the exposure of workers to the risks arising from physical agents (electromagnetic fields) (18th individual Directive within the meaning of Article 16(1) of Directive 89/391/EEC).

[17] R.M. Sterns. Cancer incidence among welders: possible effects of exposure to extremely low frequency electromagnetic radiation (ELF) and to welding fumes. Environmental Health Perspectives, 76:221–229, 1987.

[18] Review of the scientific evidence for limiting exposure to electromagnetic fields (0-300 GHz). Technical Report Vol. 15 N.3, National Radiological Protection Board, 2004.


A Circuital Approach for Eddy Currents Fast 

Evaluation in Beam-like Structures 

A. Formisano 

Dipartimento di Ingegneria Industriale e dell’Informazione 

Seconda Università di Napoli, via Roma 29, I-81031 Aversa (CE), Italy 

E-mail: Alessandro.Formisano@unina2.it 

Abstract — The electromagnetic analysis of mechanical or civil structures composed of beam-like elements, mechanically interconnected to create a structural mesh, can be formulated in terms of an equivalent lumped-element electric network. This is the case for eddy current evaluation in a truss bridge or a bridge crane exposed to a time varying electromagnetic field, or in the reinforcement of concrete in buildings. In such cases, if compatible with the accuracy needs, the network approach can be very effective thanks to the strong reduction of the model complexity. The paper proposes such a formulation, based on the concept of partial inductance. Advantage is taken of automated tree-building algorithms for electric networks, and of minimum order formulations based on loop currents, to further reduce computational complexity.

Index Terms— Eddy Currents, Electric Circuit Theory, Filamentary Structures 


The use of metallic materials in mechanical and civil 

engineering is a common practice, due to the extremely 

favorable behavior of such materials with respect to 

stresses and mechanical solicitations. On the other hand, 

when exposed to time varying electromagnetic fields, 

metallic structures react by generating a field due to 

induced currents in their volume. The effect of such fields 

may prove critical in some particular applications, such

as when forces on the structures must be kept under

control, or when aging phenomena are facilitated by 

electric currents in the metal, or when the electromagnetic 

field map must be strictly controlled in critical regions 

not far from the structures (e.g. to reduce interference on 

electronic devices or to avoid impact on physical 

phenomena requiring controlled field maps, such as 

Nuclear Magnetic Resonance, or finally to limit human 

exposure to electromagnetic energy). 

In these cases, the possible interactions with 

surrounding electromagnetic field sources must be 

considered in the design phase. Unfortunately, fully 3D 

numerical analysis would usually be required, since no 

particular symmetry or simplification can be expected to 

reduce model complexity. On the other hand, such a 

model would require a rather large computational effort, although just an estimate of the electromagnetic effects due to the mechanical structure is often enough in the

design step. 

In the particular cases when the mechanical structures 

can be modeled using interconnects of beam-like 

elements (e.g. truss bridges, bridge cranes, iron rebar in 

reinforced concrete, etc.), the electromagnetic analysis, in 

the magneto-quasi-static limit and assuming linear 

behavior of all the materials, can be formulated in terms 

of an equivalent electric network, composed of lumped elements. The current in each branch of the network is related to the current density in the corresponding beam-like element of the structure thanks to a filamentary current element approximation of the actual beam. Each current element, or current stick, is defined by its stick tips and a (scalar) stick current.

The interconnection of sticks is defined through a 

suitable incidence matrix, defined from the actual 3D 

topology. The lumped network is created by associating 

to each single beam of the original structure a lumped 

parameter model, falling in the typical circuit classes. It 

follows that a resistive parameter has to be used to model 

the Ohmic behavior of the metallic element. In addition, a 

set of inductances should be used to represent the 

induction phenomena among sticks and their capability to 

accumulate magnetic energy. Finally, an electromotive 

force (typical of the voltage sources in the circuits) can be 

used to represent the induction phenomena from assigned 

external currents. 

In principle, also capacitive parameters should be 

considered to take into account the capability to 

accumulate electric energy, but in the range of 

frequencies here considered the impact of capacitive 

phenomena can be neglected. 

The mathematical tools suited to such a class of systems are the well known Kirchhoff laws, which replace the more general Maxwell equations and significantly reduce the complexity of the model.

Although the lumped network approach to treat similar 

structures is quite widespread in the electromagnetic analysis

of mechanical structures [1-4], the network analysis is 

usually performed using standard computer codes, not 

necessarily guaranteeing the minimum computational 

effort. In this paper, an automated fundamental loop 

method is proposed to achieve a minimum complexity 

resolution of the network. 

Once currents in each branch are known, simple closed 

formulas can be used to estimate the field produced by 

each stick, forces on the structure elements, and Ohmic 

losses due to induction phenomena. This approach is particularly useful when a quick, though approximate, estimate of the impact of metallic structures is required.

II. MATHEMATICAL MODELING 

Let us consider a structure composed of Nb metallic

beams, connected in a general way at their tips in Nn 

nodes. 

The equivalent lumped network will be composed of 

Nb branches, with the same topology as the mechanical 

structure. According to the geometrical characteristics of 

the beam and, in addition, to the electromagnetic 

characteristics of the materials, each branch is endowed

with a suitable set of circuit elements, including a

resistance, an inductance, some mutual inductances, and a

suitable number of voltage sources.

A very simple example is reported in Fig. 1(a), while 

in Fig. 1(b) the equivalent electric network is sketched. 

Since the assembly is immersed in a time varying field, 

we will assume that each stick, arbitrarily oriented, 

carries a current ik, k=1, 2...Nb, that represents the 

unknown to be determined. The currents are induced by 

an external field, but their value depends also on the 

structure itself, through material properties and 

geometrical relationships. 

Figure 1: (a) An example of mechanical interconnect of beam-like metallic elements (the structure of a truss bridge); (b) its representation as an electric network, where each branch carries a branch current ik and includes a voltage source, a branch resistance, a self inductance and mutual inductances with all other branches.

Each stick can be characterized by a resistance Rk, depending on its resistivity ρk, length Lk and cross section Sk. Assuming that the penetration depth at the highest frequency involved is larger than the transverse dimension of the beam modeled by the stick, and that both resistivity and cross section are uniform along the beam, the resistance associated with the k-th stick can be computed as [5]:

$$R_k = \rho_k \frac{L_k}{S_k} \qquad (1)$$

Of course, if any of the above hypotheses fails, the more general expression using the line integral along the beam axis of ρk(l)/Sk(l) can be used.

The resistances are then assembled into a diagonal 

resistance matrix R. 
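As an illustration of how the resistance matrix could be assembled, the following minimal sketch evaluates the uniform-beam formula and, for the general case, a trapezoidal approximation of the line integral of ρk(l)/Sk(l). The resistivity, lengths and cross sections are placeholder values chosen only for the example, and the function names are hypothetical.

```python
import numpy as np

def stick_resistance(rho, length, area):
    """Resistance of a uniform stick: rho * L / S (works elementwise on arrays)."""
    return rho * length / area

def stick_resistance_nonuniform(rho_of_l, area_of_l, length, n=1001):
    """Trapezoidal approximation of the line integral of rho(l) / S(l)."""
    l = np.linspace(0.0, length, n)
    integrand = rho_of_l(l) / area_of_l(l)
    return float(np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(l)))

rho_steel = 1.5e-7                        # ohm*m, illustrative value (assumption)
lengths = np.array([2.0, 3.0, 2.5])       # m, placeholder beam lengths
areas = np.array([1.0e-3, 1.0e-3, 2.0e-3])  # m^2, placeholder cross sections

# Diagonal resistance matrix R, one entry per stick.
R = np.diag(stick_resistance(rho_steel, lengths, areas))

# The general expression reduces to the uniform formula for constant rho and S.
r_check = stick_resistance_nonuniform(
    lambda l: np.full_like(l, rho_steel),
    lambda l: np.full_like(l, areas[0]),
    lengths[0],
)
print(R.diagonal(), r_check)
```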

In addition, assuming a linear magnetic behavior, the

sticks assembly is characterized also by an inductance 

matrix Mb, whose elements describe the mutual 

inductance between sticks or, on the diagonal, their self 

inductance. Under the same assumptions used for (1) 

about skin depth, the self inductance Mkk of the k-th stick 

can be computed using [6]: 

$$M_{kk} = 2 \cdot 10^{-7} \, L_k \left[ \ln\frac{2 L_k}{r} - 1 + \frac{\delta}{L_k} \right] \qquad (2)$$

where r is the geometric mean distance and δ is the arithmetic mean distance on the corresponding k-th beam cross section. Eq. (2) gives the self inductance in henry if Lk is in meters.

The mutual inductance Mjk between the j-th and k-th 

stick can be computed using formulas from [6]; as a 

possible alternative, assuming a limited dimension of the 

cross section, the mutual inductance can be evaluated also 

by line integrating (numerically) the vector potential 

Ak(x) of stick k on the axis of stick j: 

$$M_{jk} = \int_{\Gamma_j} \mathbf{A}_k(\mathbf{x}_j) \cdot \hat{\mathbf{t}} \, dl_j \qquad (3)$$

where Γj is the centerline along the j-th beam, xj is a generic point along Γj, and t̂ is the centerline tangent unit vector. The following expression has been used in this study for Ak [5]:

$$\mathbf{A}_k(\mathbf{x}) = \frac{\mu_0 I_k}{4\pi} \, \hat{\mathbf{a}} \, \ln\frac{c + b + a}{c + b - a} \qquad (4)$$

where a, b and c are defined in Fig. 2, â is the unit vector along the k-th stick, and suitable countermeasures have been taken to avoid singularities when sticks are aligned [7].

Figure 2: Basic elements for computation of vector potential using (3).

Note that eq. (3) can be easily generalized to massive

conductors, if the thin beam approximation proves

too crude for the analysis, while this is not the case for

closed form expressions found in [6]. 
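The numerical evaluation of the mutual inductance by line integration of the vector potential can be sketched as follows. This is a minimal illustration, assuming that a is the stick length, that b and c are the distances of the field point from the two stick tips, and that a unit current flows in the source stick; the midpoint rule and all names are choices made for the example. The result is compared with the standard Neumann closed form for two equal parallel filaments, which is used here only as a check.

```python
import numpy as np

MU0 = 4.0e-7 * np.pi  # vacuum permeability, H/m

def vector_potential(x, p_start, p_end, current=1.0):
    """Vector potential of a straight filament at points x of shape (..., 3)."""
    a_vec = p_end - p_start
    a = np.linalg.norm(a_vec)                  # stick length
    a_hat = a_vec / a                          # unit vector along the stick
    b = np.linalg.norm(x - p_start, axis=-1)   # distance from the first tip
    c = np.linalg.norm(x - p_end, axis=-1)     # distance from the second tip
    log_term = np.log((c + b + a) / (c + b - a))
    return MU0 * current / (4.0 * np.pi) * log_term[..., None] * a_hat

def mutual_inductance(stick_j, stick_k, n=2000):
    """Midpoint-rule line integral of A_k (unit current) along stick j."""
    q_start, q_end = stick_j
    length_j = np.linalg.norm(q_end - q_start)
    t_hat = (q_end - q_start) / length_j
    s = (np.arange(n) + 0.5) / n
    points = q_start + s[:, None] * (q_end - q_start)
    a_k = vector_potential(points, *stick_k)
    return float(np.sum(a_k @ t_hat) * length_j / n)

# Two parallel sticks of equal length, a simple case with a known closed form.
length, distance = 1.0, 0.1
stick_k = (np.array([0.0, 0.0, 0.0]), np.array([0.0, 0.0, length]))
stick_j = (np.array([distance, 0.0, 0.0]), np.array([distance, 0.0, length]))

m_numeric = mutual_inductance(stick_j, stick_k)
root = np.hypot(length, distance)
m_closed = MU0 / (2.0 * np.pi) * (length * np.arcsinh(length / distance) - root + distance)
print(m_numeric, m_closed)
```

The sketch does not handle field points lying on the axis of the source stick, where c + b − a vanishes; this is the singular configuration for which the text mentions dedicated countermeasures.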

The structure is supposed to be immersed in the time-varying

magnetic field produced by another set of Ne

external currents ie, linked with the sticks by means of a 

mutual inductance matrix Me. 

Elements of Me can be easily evaluated using 

expressions based on (3). For particular shapes of the field source, suitable analytical (possibly approximate) expressions are available; e.g., the mutual inductance of a

stick and a power line can be easily computed using 

formulas from [6]. Of course, the more general procedure 

based on suitable decomposition in elementary sticks and 

a numerical evaluation of (3) can be used. 

Assuming that the external sources are given (because they are not influenced by the eddy currents induced in the structure or by any other factor), the emf induced in each branch can be described in circuit terms as a voltage source. The set of voltages is given by

$$\mathbf{e} = \mathbf{M}_e \frac{d\mathbf{i}_e}{dt} + \frac{d\mathbf{M}_e}{dt} \, \mathbf{i}_e$$

where the last term vanishes in case of time invariance of the matrix Me.

Within these hypotheses, the system can be regarded as an R-L circuit, where sticks play the role of branches and the Nn nodes represent the stick tips.

The network topology is described by the incidence matrix (Nn rows and Nb columns) providing, for each branch, the pair of starting and ending nodes. The incidence matrix can be easily recovered from CAD schemes of the mechanical assembly, where available, or by survey of the drawings.

It is well known that the rank of the incidence matrix is lower than the number of nodes and that, for a connected network, it equals Nn–1; consequently, the “reduced” incidence matrix A is often used, as will be done in the following. Graph theory guarantees that independent columns of an incidence matrix do not form loops. It follows that a basis of the column set defines a set of branches able to connect all the nodes of the network, i.e. a tree of the network.

A fast and effective way to search for independent columns is to determine the echelon form of A [8] and select the branches corresponding to the leading coefficients. Of course, in general several trees can be defined for an assigned network; the choice of the extracted tree among all the possible ones can be controlled by ordering the columns of the incidence matrix in such a way that the columns corresponding to the preferred branches are the leftmost ones.

In order to simplify the automatic treatment of the network topology, it is recommended to renumber the branches so that the columns of the tree branches occupy the first NT positions (NT = Nn-1 for connected networks). Then, the incidence matrix A can be partitioned as:

$$A = (A_T ; A_C) \qquad (6)$$

where $A_T$ is the $N_T \times N_T$ non-singular matrix corresponding to the tree branches, and $A_C$ is the $N_T \times N_L$ matrix corresponding to the co-tree branches, with $N_L = N_b - N_T$.

Several effective methods can be used to analyse this network.

One of the most effective and popular is the nodal technique, whose unknowns are the node potentials vn. Here, for simplicity, just the formulation for linear, memory-free, voltage controlled components is given:

$$Y_n v_n = J_n \qquad (7)$$

where Yn and Jn are the nodal matrix and the equivalent nodal drive current vector, respectively. Both can be easily evaluated from the branch parameters. In particular, $Y_n = A Y_b A^T$, where Yb is the Nb×Nb matrix with the conductances (self or mutual) of the branches. The method can be extended to circuits with linear current controlled components or linear dynamical components, and also to circuits with nonlinear components. Of course, in any case, the existence and the uniqueness of the solution have to be assessed.

Here the classical dual formulation is proposed, whose unknowns are the principal loop currents IL [9]. The model, for general dynamic systems, can be stated in the time domain; here, taking advantage of linearity, the more compact formulation in the Laplace domain is used:

$$Z_M(s) \, I_L(s) = E_M(s) \qquad (8)$$

where IL and EM are the arrays of the principal loop currents and driving voltages, respectively, and s is the complex frequency.

In (8), for simplicity, the hypothesis of linear, voltage controlled components has been assumed.

The loop impedance matrix $Z_M = B Z_b B^T$ can be deduced from the branch impedance matrix Zb = (Rb+sMb) and from the topological NL×Nb matrix B of the principal loops related to the tree [8], where NL = Nb-NT is the maximum number of independent loops. Similar expressions hold for the loop voltage sources EM, driven by the external currents.

It should be noticed that the partitioned form of B, $B = (B_T ; 1_L)$, includes an identity matrix for the co-tree columns. In addition, the first partition $B_T$ can be easily deduced from the reduced incidence matrix partitioned as in (6):

$$B_T^T = -A_T^{-1} A_C \qquad (9)$$

Once the loop currents are known, the currents in each branch are easily computed as $I = B^T I_L$.

From the branch currents, estimates of the other electrical quantities can be easily recovered using closed-form expressions for stick currents. This is the case, for example, of the flux density produced at any point in space [5], the total Ohmic power, or the net force acting on the structure.
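The topological steps described in this section (tree extraction from the echelon form, construction of the fundamental loop matrix, and solution of the loop system) can be sketched as follows. This is only an illustrative sketch under stated assumptions: dense matrices, a reduced incidence matrix of full row rank, and a sign convention for the branch voltage sources chosen for the example. The function names `fundamental_loops` and `solve_loop_currents` are hypothetical.

```python
import numpy as np

def fundamental_loops(A):
    """Return (B, order) for a reduced incidence matrix A (N_T x N_b).

    `order` lists the branch indices with the tree branches first;
    B = [B_T, 1] refers to that ordering.
    """
    n_t, n_b = A.shape
    R = A.astype(float).copy()
    tree, row = [], 0
    for col in range(n_b):                  # leftmost columns are preferred as tree branches
        if row == n_t:
            break
        pivot = row + int(np.argmax(np.abs(R[row:, col])))
        if abs(R[pivot, col]) < 1e-12:
            continue                        # dependent column: the branch goes to the co-tree
        R[[row, pivot]] = R[[pivot, row]]   # row swap
        R[row + 1:] -= np.outer(R[row + 1:, col] / R[row, col], R[row])
        tree.append(col)
        row += 1
    cotree = [c for c in range(n_b) if c not in tree]
    order = tree + cotree
    A_T, A_C = A[:, tree], A[:, cotree]
    B_T = -np.linalg.solve(A_T, A_C).T      # from A_T B_T^T + A_C = 0
    B = np.hstack([B_T, np.eye(len(cotree))])
    return B, order

def solve_loop_currents(A, R_b, M_b, e_b, s):
    """Solve Z_M I_L = E_M and return the branch currents in the original numbering."""
    B, order = fundamental_loops(A)
    Z_b = (R_b + s * M_b)[np.ix_(order, order)]   # branch impedances, tree branches first
    Z_M = B @ Z_b @ B.T
    E_M = B @ e_b[order]
    I_L = np.linalg.solve(Z_M, E_M)
    I = np.empty(len(order), dtype=complex)
    I[order] = B.T @ I_L                          # branch currents from loop currents
    return I
```

As a usage example, a three-branch loop with unit resistances and a 3 V source in one branch gives a single fundamental loop and a current of 1 A in every branch.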

III. NUMERICAL EXAMPLES 

In this section, firstly a simple example is discussed to 

illustrate the various steps of the proposed method; then a 

more complex case is presented to show the effectiveness 

of the approach. 

1. As a first example, the field produced by a reinforced concrete beam near a power line is considered. The beam is 2.4 m long, with a transverse dimension of 30 cm, a reinforcement diameter of 2 cm and a resistivity of $5 \times 10^{-5}$ Ω·m, and is 5 m away from the line. The line carries 100 A of current at 50 Hz, and is assumed to be 20 m long.

The network has 16 nodes; the 15 tree branches are branches 1-15 in Fig. 3, and the non-trivial part of the fundamental loop matrix BT is reported in Table I.

The currents in each branch can easily be computed 

using a linear system with 13 equations, and then 

post-processed to obtain estimates of the desired 

quantities. 

The highest currents are in the “longitudinal” 

branches #17 and #19 (1.4 mA), and #21 and #23 (1.5 

mA). 

A 2D FEM model, neglecting connecting elements in 

the mesh, and correcting conductivity to take into 

account the finite length of actual geometry, provides 

a current of 1.5 mA in the four “longitudinal” beams. 

The "disturbance" magnetic field produced by the 

beam is 1.72 nT at a point 10 cm away from the 

power line. Note that in the 2D FEM model, the 

reinforcement contribution was hidden by the 

numerical errors. 

2. The second example compares computational 

complexity in the case of a fully 3D geometry either 

using a commercial FEM package (COMSOL 

Multiphysics Ver. 4.2a, [10]) or using the proposed 

approach. 

The aim is to estimate the total Ohmic losses in the metallic structure of the truss beam depicted in Fig. 1.

The bridge is 16 m long, 4 m wide and 4 m high. Each beam has a square cross section with a 0.4 m side.

The bridge is made of non-magnetic structural steel, with a conductivity of $4 \times 10^{6}$ S/m.

The excitation field is provided by a circular coil of radius 5 m, hanging 10 m above the bridge. Of course such an excitation is not realistic, but it has been adopted for its ease of modelling with the FEM package.

The FEM model is solved with a mesh composed of 34,000 2nd-order tetrahedral elements, giving 300,000 unknowns, and requires 165 s to be solved on an i7-based PC with 4 GB RAM.

The total Ohmic losses are 4 mW using the FEM model and 3.5 mW using the proposed method. A map of the loss density is reported in Fig. 4.


Figure 3: Graph of the equivalent network for case 1 (beam length 2.4 m, transverse dimension 0.3 m): capital letters indicate nodes, numbers indicate branches, dashed lines are co-tree branches.

TABLE I

NON-TRIVIAL PART OF THE FUNDAMENTAL LOOP MATRIX (ROWS: LOOPS 1-13; COLUMNS: BRANCHES 1-15)

Loop \ Branch: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15

1 -1 -1 -1 0 0 0 0 0 0 0 0 0 0 0 0 

2 -1 -1 -1 -1 1 1 1 0 0 0 0 0 0 0 0 

3 0 -1 -1 -1 0 1 1 0 0 0 0 0 0 0 0 

4 0 0 -1 -1 0 0 1 0 0 0 0 0 0 0 0 

5 0 0 0 0 -1 -1 -1 0 0 0 0 0 0 0 0 

6 0 0 0 0 -1 -1 -1 -1 1 1 1 0 0 0 0 

7 0 0 0 0 0 -1 -1 -1 0 1 1 0 0 0 0 

8 0 0 0 0 0 0 -1 -1 0 0 1 0 0 0 0 

9 0 0 0 0 0 0 0 0 -1 -1 -1 0 0 0 0 

10 0 0 0 0 0 0 0 0 -1 -1 -1 -1 1 1 1 

11 0 0 0 0 0 0 0 0 0 -1 -1 -1 0 1 1 

12 0 0 0 0 0 0 0 0 0 0 -1 -1 0 0 1 

13 0 0 0 0 0 0 0 0 0 0 0 0 -1 -1 -1 

Figure 4: Ohmic Losses density plot for FEM solution of example 2. 

IV. CONCLUSIONS 

A prompt method to assess the effect of mechanical 

structures composed by interconnected conducting 

beams, exposed to low frequency magnetic fields, has 

been presented. The method is based on equivalent 

electric network analysis, and adopts a minimum order 

formulation to solve the network. 

ACKNOWLEDGEMENTS

The author wishes to thank Prof. R. Martone for fruitful discussions and for his support.

This work has been partly supported by Seconda Università di Napoli, Italy, under PRIST grant “Generazione distribuita di energia da fonti tradizionali e rinnovabili: aspetti ingegneristici e giuridici-economici-ambientali”.

REFERENCES 

[1] A. Ruehli, "Equivalent Circuit Models for Three-Dimensional 

Multiconductor Systems", IEEE Trans. on Microwave Th. and 

Tech., vol. MTT-22, pp. 216-221, 1974. 

[2] W. Pinello, A. Ruehli, “Time Domain Solutions for Coupled 

Problems using PEEC Models with Waveform Relaxation”, 

Proceedings of Antennas and Propagation Society International 

Symposium AP-S. Digest, pp. 2118-2121, 1996. 

[3] A.Y. Wu, K.S. Sun, “Formulation and implementation of the 

current filament method for the analysis of current diffusion and 

heating in railguns and homopolar generators”, IEEE Trans. on. 

Mag., vol. 25, pp. 610-615, 1989. 

[4] B. Azzerboni, E. Cardelli, M. Raugi, “Network mesh model for flux compression generators analysis”, IEEE Trans. on Magn., vol. 27, pp. 3951-3954, 1991.

[5] H. A. Haus, J. R. Melcher, Electromagnetic Fields and Energy, 

Englewood Cliffs, NJ: Prentice Hall, 1989. 

[6] F. Grover, Inductance Calculation, New York: Van Nostrand, 

1946. 

[7] J. D. Hanson, S. P. Hirshman, “Compact expressions for the Biot– 

Savart fields of a filamentary segment”, Physics of Plasmas, vol. 

9, pp. 4410-4412, 2002. 

[8] L. Chua, I. Lin, Computer-Aided Analysis of Electronic Circuits, 

3rd ed., vol. 2. Oxford: Clarendon Press, 1982. 

[9] J. Nilsson, S. Riedel. Electric Circuits, Englewood Cliffs, NJ: 

Prentice Hall, 2010. 

[10] www.Comsol.com, last visited on Sept. 4, 2012.



Effectiveness of the Preconditioned 

MRTR Method Supported by Eisenstat’s Technique 

in Real Symmetric Sparse Matrices 

*Yoshifumi Okamoto, *Tomonori Tsuburaya, † Koji Fujiwara, and *Shuji Sato 

*Department of Electrical and Electronic Systems Engineering, Utsunomiya University 

7-1-2 Yoto, Utsunomiya, Tochigi 321-8585, Japan 

† Department of Electrical Engineering, Doshisha University 

1-3 Tataramiyakodani, Kyotanabe, Kyoto 610-0321, Japan 

E-mail: okamotoy@cc.utsunomiya-u.ac.jp 

Abstract—The Incomplete Cholesky Conjugate Gradient (ICCG) method is widely used to solve the indefinite algebraic 

equations obtained from the edge-based finite element method. However, when a linear solver based on the minimum 

residual without residual oscillations is used, there is a possibility of the elapsed time being shortened. This paper shows the 

effectiveness of the preconditioned Minimized Residual method based on the Three-term Recurrence formula of the CG-type 

(MRTR) method with Eisenstat’s technique by making a comparison with ICCG method for real symmetric sparse matrices. 

Index Terms— Eisenstat’s technique, ICCG method, MRTR method, split preconditioning. 

I. INTRODUCTION

The ICCG method [1] is widely used as a solver for a real symmetric indefinite linear equation derived from the edge-based finite element method. While the behavior of a residual in CG iterations is oscillatory, a monotonic decrease in the residual is mathematically ensured in MRTR method [2], which is an algorithm identical to that of Orthomin(2) [3] and which is based on the minimum residual. Therefore, there is a possibility that MRTR can solve linear equations faster than the CG method.

IC factorization is widely recognized as a powerful preconditioner for a symmetric linear system. However, it may not necessarily be powerful in various magnetic field problems. Some of the split preconditioners such as symmetric Gauss-Seidel (SGS) and diagonal IC factorization (DIC) might be capable of improving the convergence characteristics of linear solvers.

SGS and DIC preconditioners, in which the off-diagonal components in the original linear system are directly used in the preconditioned matrix, can utilize Eisenstat’s technique [4], in which the matrix-vector product can be replaced by forward and backward substitutions. Therefore, there is a possibility that SGS and DIC preconditioners supported by Eisenstat’s technique can reduce the elapsed time by reducing the computational cost of an iterative process.

This paper shows the effectiveness of preconditioned MRTR method supported by Eisenstat’s technique for several linear systems derived from the edge-based finite element method in magnetic field analysis. Comparisons have been made with other well-known preconditioned CG methods.

II. PRECONDITIONED MRTR METHOD

A. MRTR method

The symmetric sparse linear system can be defined as follows:

$$A x = b, \qquad (1)$$

where A is a large sparse n-by-n matrix, x is a solution n-vector, and b is an n-vector. Now, suppose that the diagonal scaling has already been applied to (1).

The recurrence formula for x in MRTR method is designed using the expression

$$x_{k+1} = x_0 + z_{k+1}, \quad z_{k+1} \in K_k(A; r_0), \qquad (2)$$

where $x_{k+1}$ is the solution vector in step (k + 1) of the iterative process and $K_k(A; r_0)$ is the Krylov subspace spanned by A and the initial residual vector $r_0$. The residual vector $r_{k+1}$ is contained in $K_{k+1}(A; r_0)$ as follows:

$$r_{k+1} = b - A x_{k+1} = r_0 - A z_{k+1} \in K_{k+1}(A; r_0). \qquad (3)$$

Furthermore, the approximate solution $(x_0 + z)$ in step (k + 1) satisfies the minimization condition as follows:

$$\min_{z \in S_k} \| b - A(x_0 + z) \|_2 = \min_{z \in S_k} \| r_0 - A z \|_2, \qquad (4)$$

where $S_k$ is a subspace contained in $K_k(A; r_0)$. Using a three-term recurrence formula involving Lanczos polynomials, the algorithm of MRTR method is given as follows:

Algorithm 1 (MRTR method). Let $x_0$ be an initial guess, and put $r_0 = b - A x_0$. Set $y_0 = -r_0$ and $\nu_0 = (y_0, y_0)$.

For k = 0, 1, 2, …, repeat the following steps until the condition $\|r_k\|_2 / \|b\|_2 < \varepsilon_{MR}$ holds:

$$\zeta_k = \begin{cases} \dfrac{(A r_k, r_k)}{(A r_k, A r_k)} & (k = 0) \\[2ex] \dfrac{\nu_k (A r_k, r_k)}{\nu_k (A r_k, A r_k) - (y_k, A r_k)(A r_k, y_k)} & (k \ge 1) \end{cases},$$

$$\eta_k = \begin{cases} 0 & (k = 0) \\[1ex] \dfrac{-(y_k, A r_k)(A r_k, r_k)}{\nu_k (A r_k, A r_k) - (y_k, A r_k)(A r_k, y_k)} & (k \ge 1) \end{cases},$$

$$\nu_{k+1} = \zeta_k (A r_k, r_k),$$

$$p_k = r_k + \frac{\eta_k \zeta_{k-1}}{\zeta_k} p_{k-1},$$

$$x_{k+1} = x_k + \zeta_k p_k,$$

$$y_{k+1} = \eta_k y_k + \zeta_k A r_k,$$

$$r_{k+1} = r_k - y_{k+1},$$

where (u, v) denotes the inner product of vectors u and v. 

The above algorithm is mathematically equivalent to the 

conjugate residual (CR) method [5]. 
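Algorithm 1 can be sketched in a few lines of NumPy. This is a minimal dense-matrix illustration, not the authors' code; the scalar names `zeta`, `eta` and `nu` are labels chosen here for the three recurrence coefficients of the algorithm, and the function name `mrtr` is hypothetical.

```python
import numpy as np

def mrtr(A, b, x0=None, eps=1e-10, max_iter=1000):
    """Unpreconditioned MRTR iteration for a symmetric matrix A.

    Returns the approximate solution and the history of ||r_k||_2 / ||b||_2.
    """
    x = np.zeros(len(b)) if x0 is None else np.array(x0, dtype=float)
    r = b - A @ x
    y = -r
    nu = y @ y
    p = np.zeros_like(r)
    zeta_prev = 1.0                     # placeholder: unused at k = 0 because eta_0 = 0
    b_norm = np.linalg.norm(b)
    history = []
    for k in range(max_iter):
        history.append(np.linalg.norm(r) / b_norm)
        if history[-1] < eps:
            break
        Ar = A @ r                      # the only matrix-vector product per iteration
        Ar_r, Ar_Ar = Ar @ r, Ar @ Ar
        if k == 0:
            zeta, eta = Ar_r / Ar_Ar, 0.0
        else:
            y_Ar = y @ Ar
            denom = nu * Ar_Ar - y_Ar * y_Ar
            zeta = nu * Ar_r / denom
            eta = -y_Ar * Ar_r / denom
        nu = zeta * Ar_r
        p = r + (eta * zeta_prev / zeta) * p
        x = x + zeta * p
        y = eta * y + zeta * Ar
        r = r - y
        zeta_prev = zeta
    return x, history
```

On a small symmetric positive definite test matrix the recorded residual history decreases from one iteration to the next, which is the property the text contrasts with the oscillatory residual of CG.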

TABLE I shows the computational cost of MRTR method, along with a comparison with the CG method. Au, (u, v), (u + v), and αu denote a matrix-vector product, the inner product, the sum of vectors, and a scalar-vector product, respectively. The computational cost of MRTR method is nearly identical to that of the CG method because both require a single matrix-vector product per iteration.

TABLE I

COMPUTATIONAL COST OF LINEAR SOLVERS

linear solver | Au | (u, v) | u + v | αu
CG | 1 | 2 | 3 | 3
MRTR | 1 | 4 | 4 | 4

B. Preconditioning 

MRTR method can be combined with split preconditioning techniques as long as the preconditioner M can be written in the form $M = C C^T$ (C: a lower triangular matrix). The preconditioned matrix $\hat{A} = C^{-1} A C^{-T}$ retains the symmetry of A. Here, we utilize M and C derived using shifted IC factorization [6], DIC [1], and SGS preconditioning [7].

IC preconditioner

IC factorization is performed as follows:

$$M = \hat{L} \hat{D} \hat{L}^T, \quad C = \hat{L} \hat{D}^{1/2}, \qquad (5)$$

$$l_{ij} = \begin{cases} \gamma a_{ii} - \displaystyle\sum_{k=1}^{i-1} l_{ik}^2 d_{kk} & (i = j) \\[2ex] a_{ij} - \displaystyle\sum_{k=1}^{j-1} l_{ik} l_{jk} d_{kk} & (i > j) \end{cases}, \qquad (6)$$

$$d_{ii} = 1 / l_{ii}, \qquad (7)$$

where $l_{ij}$ and $d_{ii}$ are components of $\hat{L}$ and $\hat{D}$, and $\gamma$ is the shift parameter. $\gamma$ is determined by performing the following steps: 1. Set $\gamma$ = 1.05. 2. Perform IC factorization. 3. If all diagonal components are positive, the shifted IC factorization is stopped. Otherwise, add 0.05 to $\gamma$ and return to step 2.
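The shifted factorization with the incremental choice of the shift can be sketched as below. This is an assumption-laden illustration rather than the authors' implementation: it stores the factor densely for readability, restricts fill-in to the nonzero pattern of A, and uses the symbol `gamma` for the shift parameter; the function name `shifted_ic` is hypothetical.

```python
import numpy as np

def shifted_ic(A, gamma0=1.05, dgamma=0.05, max_tries=100):
    """Shifted incomplete Cholesky factorization on the nonzero pattern of A.

    Returns (L_hat, d, gamma), with M = L_hat @ diag(d) @ L_hat.T and d_ii = 1 / l_ii.
    """
    n = A.shape[0]
    pattern = A != 0                     # no fill-in outside the pattern of A
    gamma = gamma0
    for _ in range(max_tries):
        L = np.zeros((n, n))
        d = np.zeros(n)
        ok = True
        for i in range(n):
            for j in range(i):           # off-diagonal entries of row i
                if pattern[i, j]:
                    L[i, j] = A[i, j] - np.sum(L[i, :j] * L[j, :j] * d[:j])
            L[i, i] = gamma * A[i, i] - np.sum(L[i, :i] ** 2 * d[:i])
            if L[i, i] <= 0.0:           # non-positive pivot: enlarge the shift and retry
                ok = False
                break
            d[i] = 1.0 / L[i, i]
        if ok:
            return L, d, gamma
        gamma += dgamma
    raise RuntimeError("no shift with positive pivots found")
```

For a tridiagonal matrix the pattern admits no fill-in, so with `gamma0=1.0` the factorization reproduces A exactly, while the default shift changes only the diagonal of M.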

If (5) is used for forward and backward substitutions, there is a possibility of cache misses in backward substitution. Hence, M is rewritten as follows:

$$M = (\hat{L}\hat{D}) \hat{D}^{-1} (\hat{L}\hat{D})^T. \qquad (8)$$

Therefore, the process of forward and backward substitutions to compute the unknown vector u can be described as follows:

$$(\hat{L}\hat{D}) \hat{D}^{-1} (\hat{L}\hat{D})^T u = v, \qquad (9)$$

where v is a known vector and the diagonal components of $\hat{L}\hat{D}$ become 1.0. Forward and backward substitutions are performed by following a two-step procedure consisting of

$$(\hat{L}\hat{D}) y = v, \quad y = \hat{D}^{-1} (\hat{L}\hat{D})^T u, \qquad (10)$$

$$(\hat{L}\hat{D})^T u = \hat{D} y. \qquad (11)$$

Consequently, the computational cost can be reduced by using the forward substitution (10) and backward substitution (11) instead of the expression $M = \hat{L}\hat{D}\hat{L}^T$.

DIC preconditioner

The large sparse matrix A can be split into three terms as follows:

$$A = L + I + L^T, \qquad (12)$$

where L is the strictly lower triangular part of A and I is a unit matrix. The diagonal matrix $\hat{D}$ obtained using shifted IC factorization (see (6) and (7)) is utilized. Thus, M can be defined as follows:

$$M = (L + \hat{D}) \hat{D}^{-1} (L + \hat{D})^T, \quad C = (L + \hat{D}) \hat{D}^{-1/2}. \qquad (13)$$

Similar to the case of the IC preconditioner, the procedure for forward and backward substitution should be designed carefully. The forward and backward substitution for the DIC preconditioner is designed to make the diagonal components 1.0:

$$(L\hat{D}^{-1} + I) \hat{D} (L\hat{D}^{-1} + I)^T u = v, \qquad (14)$$

$$(L\hat{D}^{-1} + I) y = v, \quad y = \hat{D} (L\hat{D}^{-1} + I)^T u, \qquad (15)$$

$$(L\hat{D}^{-1} + I)^T u = \hat{D}^{-1} y. \qquad (16)$$

SGS preconditioner

Using (12), M for the SGS preconditioner can be defined as follows:

$$M = (L + I)(L + I)^T, \quad C = L + I. \qquad (17)$$

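Then, the algorithm of the preconditioned CG method is as follows:

Algorithm 2 (Preconditioned CG method). Let $x_0$ be $M^{-1} b$, and put $r_0 = b - A x_0$. Set $p_0 = M^{-1} r_0$ and $u_0 = p_0$.

For k = 0, 1, 2, …, repeat the following steps until the condition $\|r_k\|_2 / \|b\|_2 < \varepsilon_{CG}$ holds:

$$q = A p_k,$$

$$\alpha_k = (r_k, u_k) / (p_k, q),$$

$$x_{k+1} = x_k + \alpha_k p_k,$$

$$r_{k+1} = r_k - \alpha_k q,$$

$$u_{k+1} = M^{-1} r_{k+1},$$

$$\beta_k = (r_{k+1}, u_{k+1}) / (r_k, u_k),$$

$$p_{k+1} = u_{k+1} + \beta_k p_k.$$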

The algorithm of preconditioned MRTR method [8] is as 

follows: 

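Algorithm 3 (Preconditioned MRTR method). Let $x_0$ be $M^{-1} b$, and put $r_0 = b - A x_0$. Set $u_0 = M^{-1} r_0$, $y_0 = -r_0$, and $z_0 = M^{-1} y_0$.

For k = 0, 1, 2, …, repeat the following steps until the condition $\|r_k\|_2 / \|b\|_2 < \varepsilon_{MR}$ holds:

$$v = A M^{-1} r_k = A u_k,$$

$$w = M^{-1} A M^{-1} r_k = M^{-1} v,$$

$$\zeta_k = \begin{cases} \dfrac{(w, r_k)}{(v, w)} & (k = 0) \\[2ex] \dfrac{\nu_k (w, r_k)}{\nu_k (v, w) - (y_k, w)(w, y_k)} & (k \ge 1) \end{cases},$$

$$\eta_k = \begin{cases} 0 & (k = 0) \\[1ex] \dfrac{-(y_k, w)(w, r_k)}{\nu_k (v, w) - (y_k, w)(w, y_k)} & (k \ge 1) \end{cases},$$

$$\nu_{k+1} = \zeta_k (w, r_k),$$

$$p_k = u_k + \frac{\eta_k \zeta_{k-1}}{\zeta_k} p_{k-1},$$

$$x_{k+1} = x_k + \zeta_k p_k,$$

$$y_{k+1} = \eta_k y_k + \zeta_k v,$$

$$r_{k+1} = r_k - y_{k+1},$$

$$z_{k+1} = \eta_k z_k + \zeta_k w,$$

$$u_{k+1} = u_k - z_{k+1}.$$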
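A compact sketch of Algorithm 3 is given below. It is illustrative only: the preconditioner is passed as a callable `apply_minv` that returns $M^{-1}$ times a vector, the scalar names `zeta`, `eta` and `nu` are labels chosen here, and the function name `pmrtr` is hypothetical.

```python
import numpy as np

def pmrtr(A, b, apply_minv, eps=1e-10, max_iter=1000):
    """Preconditioned MRTR; apply_minv(v) must return M^{-1} v for a symmetric M."""
    x = apply_minv(b)                   # initial guess x_0 = M^{-1} b
    r = b - A @ x
    u = apply_minv(r)                   # u_k = M^{-1} r_k
    y = -r
    z = apply_minv(y)                   # z_k = M^{-1} y_k
    nu = y @ z
    p = np.zeros(len(b))
    zeta_prev = 1.0                     # placeholder: unused at k = 0 because eta_0 = 0
    b_norm = np.linalg.norm(b)
    for k in range(max_iter):
        if np.linalg.norm(r) / b_norm < eps:
            return x, k
        v = A @ u                       # A M^{-1} r_k
        w = apply_minv(v)               # M^{-1} A M^{-1} r_k
        w_r, v_w = w @ r, v @ w
        if k == 0:
            zeta, eta = w_r / v_w, 0.0
        else:
            y_w = y @ w
            denom = nu * v_w - y_w * y_w
            zeta = nu * w_r / denom
            eta = -y_w * w_r / denom
        nu = zeta * w_r
        p = u + (eta * zeta_prev / zeta) * p
        x = x + zeta * p
        y = eta * y + zeta * v
        r = r - y
        z = eta * z + zeta * w
        u = u - z
        zeta_prev = zeta
    return x, max_iter
```

With the SGS choice $C = L + I$ for a matrix of unit diagonal, `apply_minv` amounts to one forward and one backward substitution; in the usage below a general dense solve stands in for the triangular substitutions.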

C. Eisenstat’s technique 

Here, Eisenstat’s approach, in which the preconditioned 

matrix and vectors are mainly utilized, is applied to the 

preconditioned linear solvers in order to reduce the 

computational cost for the matrix-vector product. 

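First, we apply Eisenstat’s technique to the DIC preconditioner using the expression $C = (L\hat{D}^{-1} + I)\hat{D}^{1/2}$, and the preconditioned matrix-vector product $\hat{A}\hat{p}_k$ can be transformed into

$$\begin{aligned} \hat{A}\hat{p}_k &= \hat{D}^{-1/2} (L\hat{D}^{-1} + I)^{-1} (L + I + L^T) (L\hat{D}^{-1} + I)^{-T} \hat{D}^{-1/2} \hat{p}_k \\ &= \hat{D}^{-1/2} (L\hat{D}^{-1} + I)^{-1} \{ (L\hat{D}^{-1} + I)\hat{D} - (2\hat{D} - I) + \hat{D}(L\hat{D}^{-1} + I)^T \} (L\hat{D}^{-1} + I)^{-T} \hat{D}^{-1/2} \hat{p}_k \\ &= \hat{D}^{1/2} (L\hat{D}^{-1} + I)^{-T} \hat{D}^{-1/2} \hat{p}_k + \hat{D}^{-1/2} (L\hat{D}^{-1} + I)^{-1} \{ \hat{D}^{1/2} \hat{p}_k - (2\hat{D} - I)(L\hat{D}^{-1} + I)^{-T} \hat{D}^{-1/2} \hat{p}_k \} \\ &= \hat{D}^{1/2} C^{-T} \hat{p}_k + C^{-1} \{ \hat{D}^{1/2} \hat{p}_k - (2\hat{D} - I) C^{-T} \hat{p}_k \}. \end{aligned} \qquad (18)$$

It is shown that $\hat{A}\hat{p}_k$ can be replaced by one backward substitution $C^{-T}\hat{p}_k$ and one forward substitution $C^{-1}\{ \hat{D}^{1/2}\hat{p}_k - (2\hat{D} - I) C^{-T}\hat{p}_k \}$. On the other hand, the formulation of $\hat{A}\hat{p}_k$ with the SGS preconditioner is as follows:

$$\begin{aligned} \hat{A}\hat{p}_k &= (L + I)^{-1} (L + I + L^T) (L + I)^{-T} \hat{p}_k \\ &= (L + I)^{-1} \{ (L + I) - I + (L + I)^T \} (L + I)^{-T} \hat{p}_k \\ &= (L + I)^{-T} \hat{p}_k + (L + I)^{-1} \{ \hat{p}_k - (L + I)^{-T} \hat{p}_k \} \\ &= C^{-T} \hat{p}_k + C^{-1} ( \hat{p}_k - C^{-T} \hat{p}_k ). \end{aligned} \qquad (19)$$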
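The two identities (18) and (19) can be checked numerically with a short sketch. It is only an illustration under stated assumptions: small dense matrices, a general dense solve standing in for the triangular substitutions, and an arbitrary positive diagonal `d` in place of the diagonal from the shifted IC factorization (the algebraic identity holds for any invertible diagonal). The function names are hypothetical.

```python
import numpy as np

def eisenstat_matvec_sgs(L, p_hat):
    """A_hat @ p_hat for SGS, C = L + I: one backward and one forward substitution."""
    C = L + np.eye(L.shape[0])
    t = np.linalg.solve(C.T, p_hat)               # backward substitution C^{-T} p_hat
    return t + np.linalg.solve(C, p_hat - t)      # forward substitution

def eisenstat_matvec_dic(L, d, p_hat):
    """A_hat @ p_hat for DIC, C = (L D^{-1} + I) D^{1/2}, with D = diag(d)."""
    sqrt_d = np.sqrt(d)
    C = (L / d + np.eye(L.shape[0])) * sqrt_d     # column scaling by D^{-1}, then by D^{1/2}
    u = np.linalg.solve(C.T, p_hat)               # backward substitution C^{-T} p_hat
    return sqrt_d * u + np.linalg.solve(C, sqrt_d * p_hat - (2.0 * d - 1.0) * u)

def explicit_matvec(L, C, p_hat):
    """Reference value C^{-1} (L + I + L^T) C^{-T} p_hat, formed explicitly."""
    A = L + np.eye(L.shape[0]) + L.T
    return np.linalg.solve(C, A @ np.linalg.solve(C.T, p_hat))
```

In both functions the product with A never appears explicitly; only the strictly lower triangular part L of the scaled matrix is used, which is the point of Eisenstat's technique.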

Eisenstat’s technique is applicable only to preconditioners in which the lower triangular part of the original matrix is used as it is. The preconditioned CG method supported by Eisenstat’s technique is as follows:

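Algorithm 4 (Preconditioned CG method supported by Eisenstat’s technique). Set

$$C = \begin{cases} (L\hat{D}^{-1} + I)\hat{D}^{1/2} & \text{(DIC)} \\ L + I & \text{(SGS)} \end{cases}.$$

Let $x_0$ be $M^{-1} b$, and put $r_0 = b - A x_0$. Set $\hat{r}_0 = C^{-1} r_0$, $\hat{p}_0 = \hat{r}_0$.

For k = 0, 1, 2, …, repeat the following steps until the condition $\|r_k\|_2 / \|b\|_2 < \varepsilon_{CG}$ holds:

$$u = C^{-T} \hat{p}_k,$$

$$\hat{A}\hat{p}_k = \begin{cases} \hat{D}^{1/2} u + C^{-1} \{ \hat{D}^{1/2} \hat{p}_k - (2\hat{D} - I) u \} & \text{(DIC)} \\ u + C^{-1} ( \hat{p}_k - u ) & \text{(SGS)} \end{cases},$$

$$\alpha_k = (\hat{r}_k, \hat{r}_k) / (\hat{p}_k, \hat{A}\hat{p}_k),$$

$$x_{k+1} = x_k + \alpha_k u,$$

$$\hat{r}_{k+1} = \hat{r}_k - \alpha_k \hat{A}\hat{p}_k,$$

$$r_{k+1} = C \hat{r}_{k+1},$$

$$\beta_k = (\hat{r}_{k+1}, \hat{r}_{k+1}) / (\hat{r}_k, \hat{r}_k),$$

$$\hat{p}_{k+1} = \hat{r}_{k+1} + \beta_k \hat{p}_k.$$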

Preconditioned MRTR method supported by Eisenstat’s 

technique is as follows: 

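Algorithm 5 (Preconditioned MRTR method supported by Eisenstat’s technique). Set

$$C = \begin{cases} (L\hat{D}^{-1} + I)\hat{D}^{1/2} & \text{(DIC)} \\ L + I & \text{(SGS)} \end{cases}.$$

Let $x_0$ be $M^{-1} b$ and put $r_0 = b - A x_0$. Set $\hat{r}_0 = C^{-1} r_0$, $\hat{y}_0 = -\hat{r}_0$.

For k = 0, 1, 2, …, repeat the following steps until the condition $\|r_k\|_2 / \|b\|_2 < \varepsilon_{MR}$ holds:

$$u = C^{-T} \hat{r}_k,$$

$$\hat{A}\hat{r}_k = \begin{cases} \hat{D}^{1/2} u + C^{-1} \{ \hat{D}^{1/2} \hat{r}_k - (2\hat{D} - I) u \} & \text{(DIC)} \\ u + C^{-1} ( \hat{r}_k - u ) & \text{(SGS)} \end{cases},$$

$$\zeta_k = \begin{cases} \dfrac{(\hat{A}\hat{r}_k, \hat{r}_k)}{(\hat{A}\hat{r}_k, \hat{A}\hat{r}_k)} & (k = 0) \\[2ex] \dfrac{\nu_k (\hat{A}\hat{r}_k, \hat{r}_k)}{\nu_k (\hat{A}\hat{r}_k, \hat{A}\hat{r}_k) - (\hat{y}_k, \hat{A}\hat{r}_k)(\hat{A}\hat{r}_k, \hat{y}_k)} & (k \ge 1) \end{cases},$$

$$\eta_k = \begin{cases} 0 & (k = 0) \\[1ex] \dfrac{-(\hat{y}_k, \hat{A}\hat{r}_k)(\hat{A}\hat{r}_k, \hat{r}_k)}{\nu_k (\hat{A}\hat{r}_k, \hat{A}\hat{r}_k) - (\hat{y}_k, \hat{A}\hat{r}_k)(\hat{A}\hat{r}_k, \hat{y}_k)} & (k \ge 1) \end{cases},$$

$$\nu_{k+1} = \zeta_k (\hat{A}\hat{r}_k, \hat{r}_k),$$

$$p_k = u + \frac{\eta_k \zeta_{k-1}}{\zeta_k} p_{k-1},$$

$$x_{k+1} = x_k + \zeta_k p_k,$$

$$\hat{y}_{k+1} = \eta_k \hat{y}_k + \zeta_k \hat{A}\hat{r}_k,$$

$$\hat{r}_{k+1} = \hat{r}_k - \hat{y}_{k+1},$$

$$r_{k+1} = C \hat{r}_{k+1}.$$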

In the linear solver using Eisenstat’s technique, the lower triangular matrix-vector product $C\hat{r}_{k+1}$ is computed to evaluate the residual $r_{k+1}$.

D. Computational cost of preconditioned linear solvers 

TABLE II shows the computational cost of preconditioned linear solvers. Au, Lu, $L^{-1}u$, and $L^{-T}u$ denote the matrix-vector product, the lower triangular matrix-vector product, forward substitution, and backward substitution, respectively. The abbreviations EDIC and ESGS represent the DIC and SGS preconditioners using Eisenstat’s technique, respectively. The computational cost of the preconditioned linear solvers using Eisenstat’s technique (EDIC and ESGS) is about 10 % lower than that of the other preconditioned solvers: 0.3 + 0.3 + 0.3 = 0.9 versus 0.4 + 0.3 + 0.3 = 1.0 in the normalized units of TABLE II. The reduction is not larger because Eisenstat’s technique requires the additional lower triangular matrix-vector product $C\hat{r}_{k+1}$, whose cost is approximately equal to that of forward or backward substitution.

TABLE II

COMPUTATIONAL COST OF PRECONDITIONED LINEAR SOLVERS

linear solver | precond. | Au | Lu | L^-1 u | L^-T u | app. costs per one ite. (Au + Lu + L^-1 u + L^-T u)
CG | IC | 1 | 0 | 1 | 1 | 1.0
CG | DIC | 1 | 0 | 1 | 1 | 1.0
CG | EDIC | 0 | 1 | 1 | 1 | 0.9
CG | SGS | 1 | 0 | 1 | 1 | 1.0
CG | ESGS | 0 | 1 | 1 | 1 | 0.9
MRTR | IC | 1 | 0 | 1 | 1 | 1.0
MRTR | DIC | 1 | 0 | 1 | 1 | 1.0
MRTR | EDIC | 0 | 1 | 1 | 1 | 0.9
MRTR | SGS | 1 | 0 | 1 | 1 | 1.0
MRTR | ESGS | 0 | 1 | 1 | 1 | 0.9

approximate computational costs: Au = 0.4, Lu = 0.3, L^-1 u = 0.3, L^-T u = 0.3

III. ANALYSIS MODEL 

Figure 1 shows finite element meshes of model 

problems used for performing a magnetic field analysis. 

The unknown numbers in all meshes are determined by 

the absolute edge number based on the nodal number. 

Figure 1 (a) shows a box shield model [9] in which the 

magnetic shielding part is composed of four-layer finite 

elements in the thickness direction; the shielding 

thickness is 1 mm. Magnetostatic and eddy current 

analyses are carried out by considering the magnetic 

nonlinearity of SS400. 

Figures 1 (b) and (c) show the permanent-magnet-type MRI model [10]. For this model, magnetostatic field analysis is performed by considering the magnetic nonlinearity of the pole piece, yoke, and props. The 2nd-order hexahedral elements are of the Serendipity type.

Finally, Figure 1 (d) shows the IPM motor (D-model) proposed by the IEEJ committee. A strongly coupled analysis is performed between the magnetic field and the AC-driven three-phase circuit. The stator and overhung rotor are considered to be magnetically nonlinear, and the conductivity of the magnet is set to $6.944 \times 10^{5}$ S/m. The rotational speed is set to 1500 rpm, and the pitch of the mechanical angle is 1°. The total number of time steps is set to 360.


TABLE III lists the analyzed conditions. The Newton-Raphson (NR) method, along with the line search technique based on functional minimization (0, 1.0) [11], is used as the nonlinear analysis method. Gear’s implicit scheme [12], [13] of 2nd order is used for the discretization of the time domain based on the A-formulation.

IV. NUMERICAL RESULTS 

A. Verification of computational accuracy 

The computational accuracy of preconditioned MRTR method is verified for the box shield model. Figure 2 shows the analysis results. The magnetic flux density Bz in the z-direction on the z-axis is shown in Figure 2 (a). The characteristic of the ICCG method coincides with that of ESGS-MRTR. The relative error between the two characteristics is less than $10^{-4}$ % at points A and B. Similarly, the relative error for the eddy current loss PJe shown in Figure 2 (b) is less than $10^{-4}$ % at these points. The other preconditioned linear solvers show similar characteristics.

Figure 1: Finite element meshes (unit: mm). (a) Box shield model (coil: 2 kAT; magnetic shielding: SS400). (b) MRI (1st order tetra.) and (c) MRI (2nd order hexa.) (magnet: Br = 1.2 T; yoke, prop, and pole piece: SS400; gradient coil). (d) IPM motor (rotor core and stator core: 50A350; shaft: S45C; enlarged view of the magnet).

TABLE III

ANALYZED CONDITIONS

analysis model | formul. | discret. | no. of nodes | no. of elements | DoF | field | εCG, εMR | nonlinear ||δB||2
box shield | A | 1st-hexa | 72,900 | 67,980 | 197,472 | static | 10^-3 | 10^-2
box shield | A | 1st-hexa | 72,900 | 67,980 | 206,427 | time domain | 10^-3 | 10^-2
MRI | A | 1st-tetra | 279,090 | 49,813 | 323,965 | static | 10^-3 | 10^-3
MRI | A | 2nd-hexa | 93,879 | 87,120 | 1,014,600 | static | 10^-3 | 10^-3
IPM motor | A | 1st-hexa | 381,197 | 352,980 | 1,030,156 | time domain | 10^-3 | 10^-2

Figure 2: Some characteristics of the box shield model, comparing ICCG and ESGS-MRTR, with points A and B marked: (a) the Bz [T] distribution in the z-axis direction (z [m]) and (b) the distribution of eddy current loss PJe [W] over time t [s].

Figure 3: Convergence characteristics, log10(||r(k)||2 / ||b||2) versus iteration number k, of preconditioned linear solvers for the box shield model. (a) Magnetostatic field analysis and (b) eddy current analysis.

B. Convergence characteristics and elapsed time 

Figure 3 shows the convergence characteristics of the box shield model, obtained by the magnetostatic and eddy current analyses. The characteristics are normalized by the initial norm of the residual in the 1st NR iteration. In MRTR method, the monotonic decrease in the residual has been mathematically proved; nevertheless, there are some noise spikes in the characteristics of preconditioned MRTR method. These spikes are likely caused by the transition from one NR iteration to the next, and they are also observed for the preconditioned CG method. The characteristics of preconditioned MRTR method are superior to those of the preconditioned CG method because the monotonic decrease in the residual is mathematically guaranteed in the former method. The characteristics of the ESGS and EDIC preconditioners are consistent with those of SGS and DIC, respectively, and the characteristics of the DIC preconditioner are inferior to those of the other preconditioners. The IC-MRTR characteristics are the best among all preconditioned solvers. However, the elapsed time of ESGS-MRTR is the shortest among all solvers, as can be seen in TABLE IV. The reason for this is the reduction in the computational cost when Eisenstat’s technique is used. All results are obtained by using a PC (CPU: Intel Core i7 2600K/4.2 GHz; memory: 16 GB). All subsequent problems are solved on the same hardware.

TABLE IV

ANALYSIS RESULTS FOR BOX SHIELD MODEL

(a) MAGNETOSTATIC FIELD ANALYSIS

linear solver | precond. | total linear ite. | NR ite. | time for precond. [s] | elapsed time [s]
CG | IC | 544 (1.00) | 5 | 1.07 | 8.2 (1.00)
CG | DIC | 979 (1.80) | 5 | 1.06 | 13.8 (1.68)
CG | EDIC | 979 (1.80) | 5 | 1.07 | 13.0 (1.59)
CG | SGS | 653 (1.20) | 5 | 0.03 | 8.3 (1.01)
CG | ESGS | 653 (1.20) | 5 | 0.03 | 7.9 (0.96)
MRTR | IC | 448 (0.82) | 5 | 1.01 | 7.4 (0.90)
MRTR | DIC | 812 (1.49) | 5 | 1.09 | 12.4 (1.51)
MRTR | EDIC | 812 (1.49) | 5 | 1.06 | 11.5 (1.40)
MRTR | SGS | 552 (1.01) | 5 | 0.03 | 7.7 (0.94)
MRTR | ESGS | 552 (1.01) | 5 | 0.03 | 7.0 (0.85)

(b) EDDY CURRENT ANALYSIS IN TIME DOMAIN (1ST TIME STEP)

linear solver | precond. | total linear ite. | NR ite. | time for precond. [s] | elapsed time [s]
CG | IC | 503 (1.00) | 4 | 1.07 | 8.6 (1.00)
CG | DIC | 844 (1.68) | 4 | 1.05 | 13.6 (1.58)
CG | EDIC | 844 (1.68) | 4 | 1.06 | 12.8 (1.49)
CG | SGS | 595 (1.18) | 4 | 0.03 | 8.7 (1.01)
CG | ESGS | 595 (1.18) | 4 | 0.03 | 8.2 (0.95)
MRTR | IC | 393 (0.78) | 4 | 1.00 | 7.4 (0.86)
MRTR | DIC | 694 (1.38) | 4 | 1.07 | 12.1 (1.41)
MRTR | EDIC | 694 (1.38) | 4 | 1.03 | 11.2 (1.30)
MRTR | SGS | 470 (0.93) | 4 | 0.03 | 7.4 (0.86)
MRTR | ESGS | 470 (0.93) | 4 | 0.03 | 6.8 (0.79)

Figure 4 shows the convergence characteristics of the MRI models. The convergence characteristics of preconditioned MRTR are superior to those of the preconditioned CG method. The DIC preconditioner is not very effective in improving the convergence characteristics. While the IC preconditioner is successful for the 1st-order tetrahedral mesh, the SGS preconditioner is the most effective for the 2nd-order hexahedral mesh. The effectiveness of the preconditioner thus depends on the target problem. TABLE V shows the analysis results for the MRI model. The elapsed time of ESGS-MRTR is the shortest among all preconditioned solvers.

TABLE VI shows the analysis results for the IPM motor. The number of NR iterations differs among the solvers owing to slight discrepancies in the converged solution at every time step. The elapsed time of ESGS-MRTR is the shortest among all linear solvers.

V. CONCLUSION 

This paper shows the suitability of the preconditioned

MRTR method for solving the algebraic equations derived

from the edge-based finite element method in magnetic

field analysis. The elapsed time of the MRTR method can

be reduced by using the symmetric Gauss-Seidel

preconditioner supported by Eisenstat’s technique.

[Figure 4 plots: vertical axis log10(||r(k)||2 / ||b||2); curves for ICCG, DIC-CG, EDIC-CG, SGS-CG, ESGS-CG, IC-MRTR, DIC-MRTR, EDIC-MRTR, SGS-MRTR and ESGS-MRTR; horizontal axis range 0 to 1000 in (a) and 0 to 4500 in (b).]

Figure 4: Convergence characteristics of preconditioned 

linear solvers for the MRI model. (a) Tetrahedron and (b) 

2nd-order hexahedron. 

TABLE V

ANALYSIS RESULTS FOR THE MRI MODEL

(a) 1ST ORDER TETRAHEDRON

linear solver  precond.
CG             IC        572 (1.00)    7   0.71   11.5 (1.00)
CG             DIC       958 (1.67)    7   0.73   18.6 (1.62)
CG             EDIC      958 (1.67)    7   0.72   17.6 (1.53)
CG             SGS       649 (1.14)    7   0.03   11.9 (1.03)
CG             ESGS      649 (1.14)    7   0.03   11.3 (0.98)
MRTR           IC        481 (0.84)    7   0.73   10.8 (0.94)
MRTR           DIC       777 (1.36)    7   0.72   16.7 (1.45)
MRTR           EDIC      777 (1.36)    7   0.74   15.3 (1.33)
MRTR           SGS       542 (0.95)    7   0.03   11.0 (0.96)
MRTR           ESGS      542 (0.95)    7   0.03   10.1 (0.88)

(b) 2ND ORDER HEXAHEDRON

linear solver  precond.
CG             IC        3,795 (1.00)  8   25.6   570.7 (1.00)
CG             DIC       4,305 (1.13)  8   25.2   643.5 (1.13)
CG             EDIC      4,219 (1.11)  8   25.3   585.4 (1.03)
CG             SGS       2,590 (0.68)  8   0.50   370.1 (0.65)
CG             ESGS      2,590 (0.68)  8   0.50   342.8 (0.60)
MRTR           IC        2,785 (0.73)  8   23.9   443.3 (0.78)
MRTR           DIC       3,357 (0.88)  8   25.7   530.9 (0.93)
MRTR           EDIC      3,357 (0.88)  8   23.3   470.3 (0.82)
MRTR           SGS       2,112 (0.56)  8   0.50   316.1 (0.55)
MRTR           ESGS      2,112 (0.56)  8   0.50   287.6 (0.50)


TABLE VI

ANALYSIS RESULTS FOR THE IPM MOTOR

linear solver  precond.  total linear ite.  total NR ite.  time for precond. [h]  elapsed time [h]
CG             IC        1,666,584 (1.00)   4,070 (0.80)   4.84                   39.1 (1.00)
CG             DIC       1,799,885 (1.08)   5,065 (1.00)   4.81                   43.6 (1.12)
CG             EDIC      1,779,282 (1.07)   4,999 (0.99)   4.82                   40.9 (1.05)
CG             SGS       1,400,867 (0.84)   4,996 (0.99)   0.03                   29.7 (0.76)
CG             ESGS      1,406,236 (0.84)   4,993 (0.99)   0.03                   28.6 (0.73)
MRTR           IC        931,375 (0.56)     4,450 (0.88)   5.24                   26.6 (0.68)
MRTR           DIC       1,088,578 (0.65)   4,716 (0.93)   4.95                   29.7 (0.76)
MRTR           EDIC      1,067,788 (0.64)   5,242 (1.03)   4.96                   27.6 (0.71)
MRTR           SGS       866,637 (0.52)     5,178 (1.02)   0.04                   19.6 (0.50)
MRTR           ESGS      864,545 (0.52)     5,162 (1.02)   0.04                   18.2 (0.47)


The authors would like to thank Dr. K. Abe and Dr. Y. 

Takahashi for their advice and helpful comments. This 

work was supported by a Japan Society for the Promotion 

of Science (JSPS) Grant-in-Aid for Young Scientists (B) 

(Grant Number: 23760252). 

REFERENCES 

[1] J. A. Meijerink and H. A. van der Vorst, “An iterative solution 

method for linear systems of which the coefficient matrix is a 

symmetric M-matrix,” Mathematics of Computation, Vol. 31, No. 

137, pp. 148-162, Jan. 1977. 

[2] K. Abe, S.-L. Zhang, and T. Mitsui, “MRTR method: an iterative 

method based on the three-term recurrence formula of CG-type for 

nonsymmetric matrix,” The Japan Society for Industrial and 

Applied Mathematics, Vol. 7, No. 1, pp. 37-50, Mar. 1997. (in 

Japanese) 

[3] K. Abe and S.-L. Zhang, “A variant algorithm of the Orthomin(m) 

method for solving linear systems,” Appl. Math. Comput., Vol. 206, 

No. 1, pp. 42-49, Dec. 2008. 

[4] S. C. Eisenstat, “Efficient implementation of a class of 

preconditioned conjugate gradient methods,” SIAM J. Sci. Stat. 

Comput., Vol. 2, No. 1, pp. 1-4, Mar. 1981. 

[5] S. C. Eisenstat, H. C. Elman, and M. H. Schultz, “Variational 

iterative methods for nonsymmetric systems of linear equations,” 

SIAM J. Numer. Anal., Vol. 20, No. 2, pp. 345-357, Apr. 1983. 

[6] K. Fujiwara, T. Nakata, and H. Fusayasu, “Acceleration of 

convergence characteristic of the ICCG method,” IEEE Trans. 

Magn., Vol. 29, No. 2, pp. 1958-1961, Mar. 1993. 

[7] O. Axelsson, “A generalized SSOR method,” BIT Numerical 

Mathematics, Vol. 12, No. 4, pp. 443-467, Jul. 1972. 

[8] A. Shiode, S. Fujino, and K. Abe, “Preconditioning for symmetric 

positive definite matrices of MRTR method,” Trans. JSCES, No. 

20060007, pp. 231-237, Feb. 2006. (in Japanese) 

[9] A. Kameari, “Improvement of ICCG convergence for thin 

elements in magnetic field analyses using the finite-element 

method,” IEEE Trans. Magn., Vol. 44, No. 6, pp. 1178-1181, Jun. 

2008. 

[10] K. Miyata, K. Ohashi, A. Muraoka, and N. Takahashi, “3-D 

magnetic field analysis of permanent-magnet type of MRI taking 

account of minor loop,” IEEE Trans. Magn., Vol. 42, No. 4, pp. 

1451-1454, Apr. 2006. 

[11] Y. Okamoto, K. Fujiwara, and R. Himeno, “Exact minimization of 

energy functional for NR method with line-search technique,” IEEE 

Trans. Magn., Vol. 45, No. 3, pp. 1288-1291, Mar. 2009. 

[12] C. W. GEAR, Numerical Initial Value Problems in Ordinary 

Differential Equations. Englewood Cliffs, NJ: Prentice-Hall, Inc., 

1971. 

[13] Y. Okamoto, K. Fujiwara, and Y. Ishihara, “Effectiveness of 

higher order time integration in time-domain finite-element 

analysis,” IEEE Trans. Magn., Vol. 46, No. 8, pp. 3321-3324, Aug. 

2010.


High Frequency Mixing Rule Based Effective 

Medium Theory of Metamaterials 

Zsolt Szabó 

Department of Broadband Infocommunications and Electromagnetic Theory, 

Budapest University of Technology and Economics, Egry József 18, 1111 Budapest, Hungary, 

E-mail: szabo@evt.bme.hu 

Abstract— The electromagnetic response of metamaterials is governed by the collective behavior of engineered electric and 

magnetic dipoles. Therefore metamaterials may be replaced by hypothetical composites of spherical particles embedded in a 

host material. The effective electric permittivity and magnetic permeability of such systems can be computed with high 

frequency extension of the Maxwell-Garnett mixing rule. The validity of this assumption is discussed and as a benchmark the 

effective electromagnetic material parameters of a deep subwavelength spherical composite are calculated in three different 

ways: with the Maxwell-Garnett mixing rule, high frequency mixing rule and directly extracted from transmission reflection 

data. The developed theory is applied to find the parameters of a composite with similar magnetic response as a metamaterial 

built of split ring resonator. 

Index Terms—metamaterials, effective medium theory, Maxwell-Garnett mixing, Mie theory. 

I. INTRODUCTION


Metamaterials are currently the focus of very intensive

research and, owing to their unique properties, are promising

over the full electromagnetic spectrum [1-3]. The research

of metamaterials has started with the goal of producing 

materials with negative refractive index i.e. simultaneous 

negative electric permittivity and magnetic permeability 

for imaging applications below the diffraction limit [4]. 

However, the ultimate goal of the metamaterial research 

is to fabricate materials with arbitrarily configurable 

electric and magnetic properties. With an advance in 

micro- and nano-manufacturing techniques there are 

possibilities to produce subwavelength structures that can 

support symmetric and anti-symmetric modes. The 

associated current flow produces electric and magnetic 

dipole moments. A metamaterial with a customized 

optical response can be built as a superposition of such 

nano-elements. A very common design of an artificial 

material with tailored negative permittivity is the wire 

medium [5]. The most common designs to produce 

artificial magnetism are the variations of the split ring 

resonators [6] or pairs of nanorods [7]. The superposition 

of subwavelength structures with negative electric 

permittivity and magnetic permeability can lead to 

negative refractive index [8] even at optical frequencies 

[9]. However, the losses and the finite size of the unit cell

result in a cutoff frequency, limiting the applicability of

metamaterials. In addition, toward optical frequencies it is

increasingly challenging to fabricate the meta-structures,

especially those providing a negative magnetic response.

The design of devices with metamaterials often 

requires the application of the effective medium theory. 

However robust effective metamaterial parameter 

extraction and homogenization are unsolved theoretical 

challenges of the metamaterial research. In spite of 

considerable progress, researchers are still debating the 

fundamental issues and question the validity of the 

effective medium concept, which is considered by many 

as the Achilles-heel of this research field. 

In this paper it is argued that metamaterials can be 

homogenized when their electromagnetic response is 

governed by the excitation of electric and magnetic 

dipoles. The electromagnetic response of spherical 

particles can be replaced with static dipoles when the 

sphere is very small compared to the optical wavelength 

of the incident electromagnetic wave and by radiating 

dipoles, when the size is larger. The analytical formulas 

of the Mie theory explain precisely the scattering 

mechanism. Metamaterials may be equivalent to a 

properly chosen hypothetical composite of spherical 

particles embedded in a host material. Therefore well 

developed effective medium theories of composite 

materials can be applied to metamaterials. The validity of 

this assumption is discussed and as a benchmark the 

effective electromagnetic material parameters of a deep 

subwavelength composite of spherical particles are 

calculated in three different ways: with the Maxwell- 

Garnett mixing rule, high frequency mixing rule and 

extracted directly from transmission reflection data. The 

developed theory is applied to find the parameters of a 

composite with similar magnetic response as a 

metamaterial built up of split ring resonator. 

II. EFFECTIVE MEDIUM THEORIES OF METAMATERIALS 

Several effective medium theories of metamaterials 

have been developed. In Fig. 1 two models of 

metamaterial homogenization are presented. The effective 

metamaterial parameters can be extracted by replacing the 

electromagnetic response of the metamaterials with the 

electromagnetic response of a homogeneous isotropic slab,

as shown in Fig. 1.b. The model of Fig. 1.c replaces

the metamaterial with the hypothetical composite of 

spherical particles embedded in a host material. In both 

cases the electromagnetic properties can be determined in 

such a way that the metamaterial slab and the slab with 

the homogenized material parameters have the same 

reflection S 11 and transmission S 21 parameters. 

When the metamaterial is replaced with a homogeneous

slab, the effective metamaterial parameters can be

expressed from the Fresnel relations. However, the extracted wave

impedance is exact only in the quasi static limit [10] and

the unique extraction of the refractive index is 

cumbersome due to the branching problem of the 

refractive index; that is the calculation of the refractive 

index involves the evaluation of a complex logarithm that 

is a multi-valued function. To remove this ambiguity, the 

Kramers–Kronig relation can be applied to estimate the 

refractive index from the extinction coefficient [11]. The 

physically realistic exact values of the refractive index are 

determined by selecting those branches of the logarithmic 

function which are closest to those predicted by the 

Kramers–Kronig relation. Finally from the wave 

impedance and from the refractive index the electric 

permittivity and the magnetic permeability can be 

calculated. 
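
The branch selection described above can be sketched as follows. This is a minimal illustration and not the algorithm of [11] or [20]: it assumes the standard normal-incidence formulas for a homogeneous slab in vacuum with an exp(-iωt) time convention, and it takes a single reference value `n_reference` in place of the Kramers–Kronig estimate. The names `slab_s_parameters` and `retrieve` are hypothetical.

```python
import cmath
import math


def slab_s_parameters(n, z, k0d):
    """Forward model: S11 and S21 of a homogeneous slab (normal incidence)."""
    r01 = (z - 1) / (z + 1)              # interface reflection coefficient
    p = cmath.exp(1j * n * k0d)          # propagation factor exp(i n k0 d)
    denom = 1 - r01**2 * p**2
    return r01 * (1 - p**2) / denom, (1 - r01**2) * p / denom


def retrieve(s11, s21, k0d, n_reference, max_branch=5):
    """Return (n, z, eps, mu), picking the log branch closest to n_reference."""
    z = cmath.sqrt(((1 + s11)**2 - s21**2) / ((1 - s11)**2 - s21**2))
    if z.real < 0:                       # passive medium: Re(z) >= 0
        z = -z
    p = s21 / (1 - s11 * (z - 1) / (z + 1))
    log_p = cmath.log(p)                 # principal branch of the logarithm
    n_imag = -log_p.real / k0d           # does not depend on the branch
    candidates = [(log_p.imag + 2 * math.pi * b) / k0d
                  for b in range(-max_branch, max_branch + 1)]
    n_real = min(candidates, key=lambda c: abs(c - n_reference))
    n = complex(n_real, n_imag)
    return n, z, n / z, n * z            # eps = n / z, mu = n * z


# Hypothetical slab, thick enough that the principal branch is wrong
n_true, z_true, k0d = 2.0 + 0.05j, 0.8 - 0.1j, 4.0
s11, s21 = slab_s_parameters(n_true, z_true, k0d)
n_est, z_est, eps_est, mu_est = retrieve(s11, s21, k0d, n_reference=1.9)
print(n_est, z_est)
```

In this example the principal branch alone would give a real part of about 0.43, so the reference value is what selects the physically meaningful branch.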

(a) (b) (c)

Figure 1: Homogenization models of metamaterials

The Maxwell-Garnett mixing rule [10, 12, 13] can
provide the effective electric permittivity of dilute, two
component mixtures and it is derived with the assumption
that the spherical inclusions can be replaced by static
electric dipoles with polarizability

$$\alpha_e = r^3 \frac{\varepsilon_r^i - \varepsilon_r^h}{\varepsilon_r^i + 2\varepsilon_r^h}, \tag{1}$$

where $\varepsilon_r^h$ is the electric permittivity of the host material,
$\varepsilon_r^i$ is the electric permittivity and $r$ is the radius of the
spherical inclusions. The connection between the
polarizability and the effective electric permittivity $\varepsilon_r^{eff}$ is
given by the Clausius-Mossotti relation [13]

$$\frac{\varepsilon_r^{eff} - \varepsilon_r^h}{\varepsilon_r^{eff} + 2\varepsilon_r^h} = \frac{\zeta}{r^3}\,\alpha_e, \tag{2}$$

where $\zeta$ is the filling factor of the spherical inclusion.
When (1) is substituted in the Clausius-Mossotti relation
it results in the Maxwell-Garnett mixing formula

$$\frac{\varepsilon_r^{eff} - \varepsilon_r^h}{\varepsilon_r^{eff} + 2\varepsilon_r^h} = \zeta\,\frac{\varepsilon_r^i - \varepsilon_r^h}{\varepsilon_r^i + 2\varepsilon_r^h}. \tag{3}$$

In this relation, the size of the spherical inclusions does not
appear directly; the filling factor $\zeta$ is the only
geometry factor in the Maxwell-Garnett formula. The
static dipole approximation is valid only for spheres
which are very small compared to the optical wavelength
of the incident electromagnetic wave. The size parameter

$$x = \sqrt{\varepsilon_r^h \mu_r^h}\,\frac{\omega r}{c_0},$$

where $\omega$ is the angular frequency of
the incident radiation and $c_0$ is the speed of light in
vacuum, provides the guideline for the validity of the
Maxwell-Garnett mixing rule, with the necessary
condition $x \ll 1$.

However, the limits of the Maxwell-Garnett
mixing rule can be extended. The Mie theory
explains precisely the scattering mechanism of standalone
spherical particles of any size and offers an analytic solution
in the form of infinite series [14]. When the magnetic
permeability of the host material and of the spherical
particle is equal, the Mie coefficients are

$$a_n = \frac{m\Psi_n(mx)\Psi'_n(x) - \Psi_n(x)\Psi'_n(mx)}{m\Psi_n(mx)\xi'_n(x) - \xi_n(x)\Psi'_n(mx)},$$

$$b_n = \frac{\Psi_n(mx)\Psi'_n(x) - m\Psi_n(x)\Psi'_n(mx)}{\Psi_n(mx)\xi'_n(x) - m\xi_n(x)\Psi'_n(mx)}, \tag{4}$$

where $m = \sqrt{\varepsilon_r^i \mu_r^i} / \sqrt{\varepsilon_r^h \mu_r^h}$ is the contrast of the
refractive index and $\Psi_n$ and $\xi_n$ are the Riccati-Bessel
functions. The radiating electric and magnetic dipole
polarizabilities correspond to the first terms of the
expansion and can be expressed with the Mie scattering
coefficients as

$$\alpha_e = i\,\frac{3r^3}{2x^3}\,a_1, \qquad \alpha_m = i\,\frac{3r^3}{2x^3}\,b_1. \tag{5}$$

Substituting (5) in the Clausius-Mossotti relation leads to
the expression of the effective electric permittivity and,
with a similar argument, to the expression of the effective
magnetic permeability [15, 16]

$$\varepsilon_r^{eff} = \varepsilon_r^h\,\frac{x^3 + 3i\zeta a_1(mx)}{x^3 - \frac{3}{2}i\zeta a_1(mx)}, \qquad \mu_r^{eff} = \mu_r^h\,\frac{x^3 + 3i\zeta b_1(mx)}{x^3 - \frac{3}{2}i\zeta b_1(mx)}, \tag{6}$$

where $a_1$ and $b_1$ are the first terms of the Mie scattering
coefficients and $i = \sqrt{-1}$ is the imaginary unit. The
evaluation of $a_1$ and $b_1$ is trivial, because in (4) for
$n = 1$ the Riccati-Bessel functions and their derivatives
can be written as simple expressions of trigonometric
functions:

$$\Psi_1(\rho) = \frac{\sin\rho}{\rho} - \cos\rho,$$

$$\Psi'_1(\rho) = \sin\rho\left(1 - \frac{1}{\rho^2}\right) + \frac{\cos\rho}{\rho},$$

$$\xi_1(\rho) = \Psi_1(\rho) - i\left(\frac{\cos\rho}{\rho} + \sin\rho\right),$$

$$\xi'_1(\rho) = \Psi'_1(\rho) + i\left(\Psi_1(\rho) + \frac{\cos\rho}{\rho^2}\right).$$

When the size of the spherical inclusions is not small 

enough to be replaced with static dipoles, but it is small 

enough to disregard all higher order modes of (4) then the 

high frequency mixing formulas (6) are applicable. Note 

that the resonance based magnetic metamaterials are 

working under similar conditions [1]. Metamaterials have

finite unit cell sizes, and especially magnetic

metamaterials have unit cells, which are not deep

subwavelength. The strength of the resonance decreases 

with the size of the unit cell and the resonance is not 

strong enough to produce negative permeability for 

structures with deep sub-wavelength elements. 

Metamaterials with larger unit cell can support higher 

order modes at frequencies, which are just slightly 

different than the frequency region where the double 

negative behavior occurs. Special care must be taken 

when metamaterial parameters are extracted directly from 

transmission reflection data, and it is not sufficient to 

enforce the continuity of the refractive index, because we 

may extract erroneous effective metamaterial parameters 

for frequency regions where they do not even exist. The 

high frequency mixing rule, which is based on the Mie theory,

provides an estimate for the limits of the homogenization.
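
The Maxwell-Garnett formula (3) and the high frequency mixing formulas (6) can be compared numerically with a short sketch. It is only an illustration under stated assumptions: the inclusion and the host have equal permeability, the n = 1 Riccati-Bessel expressions given above are used directly, and all function names and the material values in the example are hypothetical.

```python
import cmath


def psi1(rho):
    return cmath.sin(rho) / rho - cmath.cos(rho)


def dpsi1(rho):
    return cmath.sin(rho) * (1 - 1 / rho**2) + cmath.cos(rho) / rho


def xi1(rho):
    return psi1(rho) - 1j * (cmath.cos(rho) / rho + cmath.sin(rho))


def dxi1(rho):
    return dpsi1(rho) + 1j * (psi1(rho) + cmath.cos(rho) / rho**2)


def mie_a1_b1(m, x):
    """First Mie coefficients from (4) with n = 1."""
    a1 = ((m * psi1(m * x) * dpsi1(x) - psi1(x) * dpsi1(m * x)) /
          (m * psi1(m * x) * dxi1(x) - xi1(x) * dpsi1(m * x)))
    b1 = ((psi1(m * x) * dpsi1(x) - m * psi1(x) * dpsi1(m * x)) /
          (psi1(m * x) * dxi1(x) - m * xi1(x) * dpsi1(m * x)))
    return a1, b1


def maxwell_garnett(eps_i, eps_h, zeta):
    """Effective permittivity solved from (3)."""
    beta = (eps_i - eps_h) / (eps_i + 2 * eps_h)
    return eps_h * (1 + 2 * zeta * beta) / (1 - zeta * beta)


def high_frequency_mixing(eps_i, eps_h, zeta, x, mu_h=1.0):
    """Effective permittivity and permeability from (6)."""
    m = cmath.sqrt(eps_i / eps_h)        # equal permeabilities assumed
    a1, b1 = mie_a1_b1(m, x)
    eps_eff = eps_h * (x**3 + 3j * zeta * a1) / (x**3 - 1.5j * zeta * a1)
    mu_eff = mu_h * (x**3 + 3j * zeta * b1) / (x**3 - 1.5j * zeta * b1)
    return eps_eff, mu_eff


# Hypothetical values in the deep subwavelength regime (x << 1)
eps_i, eps_h, zeta, x = 10 + 1j, 2.0, 0.1, 0.01
eps_mg = maxwell_garnett(eps_i, eps_h, zeta)
eps_hf, mu_hf = high_frequency_mixing(eps_i, eps_h, zeta, x)
print(eps_mg, eps_hf, mu_hf)
```

For x much smaller than one the two permittivities should agree closely and the effective permeability should stay near one, which is the static dipole limit discussed above; increasing x makes the two rules drift apart.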

III. EFFECTIVE MATERIAL PARAMETERS OF COMPOSITE 

WITH SPHERICAL METALLIC INCLUSIONS 

In this section the effective parameters of the 

composite material with the unit cell illustrated in Fig. 2.a 

are calculated. This composite serves as benchmark to 

compare the effective material parameters calculated with 

the Maxwell Garnett mixing rule, the high frequency 

mixing rule and directly extracted from transmission 

reflection data. The geometry and the composition are 

selected such that the size parameter of the spheres at 

optical frequencies satisfies the condition x ≪ 1. The

length of the cubic unit cell is 15 nm and the radius of the

sphere is 3 nm. The spherical inclusions are made of Ag 

and are embedded in SiO2 host and the calculations take 

into account the frequency dispersion of the materials 

parameters. Fig. 2.b presents the electric permittivity of 

the Ag inclusions, and Fig. 2.c plots the electric 

permittivity of the SiO2 host [17, 18]. The composite is 

considered infinitely large in the x and y directions (see 

Fig. 2.a) and only-one-unit-cell thick in the z direction. 

The Maxwell-Garnett type mixing rules do not require 

cubic unit cells; the requirement is that the inclusions are 

separated. For periodically arranged spherical inclusions, 

when the filling factor is high, the Maxwell-Garnett 

mixing rule has to be modified [12]. On the other hand,

disorder and inaccuracy of shapes destroy the collective

effects and extend the limits of the theory.
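
The geometry above fixes the filling factor and bounds the size parameter, which can be checked with a few lines. This is a rough sketch: the host permittivity `EPS_H_ASSUMED` is an assumed constant of about 2.1 for SiO2, whereas the paper uses dispersive data from [17, 18], and the helper names are hypothetical.

```python
import math

C0 = 299_792_458.0  # speed of light in vacuum [m/s]


def filling_factor(radius, cell):
    """Volume fraction of one sphere in a cubic unit cell."""
    return (4.0 / 3.0) * math.pi * radius**3 / cell**3


def size_parameter(eps_h, mu_h, freq, radius):
    """x = sqrt(eps_h * mu_h) * omega * r / c0."""
    return math.sqrt(eps_h * mu_h) * 2.0 * math.pi * freq * radius / C0


radius, cell = 3e-9, 15e-9          # 3 nm sphere in a 15 nm cell
EPS_H_ASSUMED = 2.1                 # assumption, not a value from the paper
zeta = filling_factor(radius, cell)
x_max = size_parameter(EPS_H_ASSUMED, 1.0, 1e15, radius)  # upper end, 1 PHz
print(f"zeta = {zeta:.4f}, x at 1 PHz = {x_max:.3f}")
```

Under these assumptions the filling factor is roughly 0.034 and the size parameter stays below 0.1 over the whole 0.4 to 1 PHz range, so the composite is dilute and deep subwavelength.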

The aim of the calculation is to determine the effective 

parameters of this composite in the frequency range from 

0.4 to 1 PHz. The calculations of the transmission 

reflection data (S-parameters) of this paper are performed 

with the frequency-domain solver of the commercial 

software CST Microwave Studio [19]. Due to periodicity, 

one unit cell with perfect electric conducting and perfect 

magnetic conducting boundary conditions in the x and the 

y directions is sufficient to calculate the S-parameters 

below the frequencies where diffraction occurs. In the z 

direction additional air regions are added to the 

computational space by positioning waveguide ports at 

one-unit-cell distance from the surface of the composite. 

The fundamental mode of the waveguide ports is excited 

to launch a plane wave, which is propagating along the z 

direction; at the same time the waveguide ports act as 


absorbing boundary conditions and permit the automatic

calculation of the S-parameters [19]. The online algorithm 

[20] is applied to extract the electromagnetic parameters 

from the S parameters. To get a good estimate for the 

Kramers–Kronig integral, the simulations cover the 0.25– 

1.25 PHz frequency interval. When this frequency 

interval is even larger, the accuracy of the Kramers– 

Kronig approximation does not change noticeably in the 

frequency range of interest. 

(a) 

(b) 

(c) 

Figure 2: The geometry of the composite material is 

shown in (a), the electric permittivity of the spherical 

inclusions made of Ag is presented in (b) and the electric 

permittivity of the SiO2 host materials is plotted in (c). 

Fig. 3.a and 3.b present the magnitude and phase of
the S-parameters. The absorption $A = 1 - |S_{11}|^2 - |S_{21}|^2$
as a function of frequency is plotted as well, showing a
resonant peak at $f_1 = 0.7192$ PHz.

The effective electric permittivity of the composite
material, calculated in three different ways (with
the high frequency mixing rule, with the Maxwell-Garnett
mixing rule and extracted from the S-parameters), is
presented in Fig. 4. The magnetic permeability is obtained
from the high frequency mixing rule and is extracted
from the S-parameters as well; it has values close to
one over the frequency range of interest. Comparing the
real and imaginary parts of the electric permittivity
obtained with the three different methods, a very good
agreement can be observed. The peak in the imaginary
part of the electric permittivity corresponds to the
absorption peak of Fig. 3.a, which reveals that it is an
electric resonance. The electric permittivity has a Lorentz
shape and can be successfully fitted with a single
oscillator model

$$\varepsilon_r(\omega) = \varepsilon_{r\infty} + \frac{(\varepsilon_{rs} - \varepsilon_{r\infty})\,\omega_0^2}{\omega_0^2 + i\delta\omega - \omega^2}, \tag{7}$$

where the static electric permittivity $\varepsilon_{rs} = 2.379$, the
electric permittivity at very high frequencies $\varepsilon_{r\infty} = 2.23$,
the resonant frequency $\omega_0 = 2\pi \cdot 0.7228$ rad/fs and the
damping constant $\delta = 0.33$ 1/fs.

(a) 

(b) 

Figure 3: The S-parameters of the one-unit-cell thick 

composite material. In (a), the magnitude, and in (b) the 

phase of the S-parameters is plotted. Note the absorption

peak at f1 = 0.7192 PHz.

Figure 4: The effective electric permittivity of the 

composite material calculated with the high frequency 

mixing rule, Maxwell-Garnett mixing and extracted from 

the S-parameters. 

The electric permittivity at the frequency of the
absorption peak is $\varepsilon_r = 2.5319 + 2.0465i$ and the
corresponding optical wavelength is
$\lambda_{opt} = c_0 / (n f_1) = 245.04$ nm, which is much larger than
any characteristic dimension of the composite (the size of
the unit cell is 15 nm), showing that the resonant behavior
is related to the composition rather than structuring.
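
The quoted wavelength can be reproduced from the permittivity at the absorption peak. The short check below assumes a relative permeability of one, as reported for this composite, so that the refractive index is the real part of the square root of the permittivity; the variable names are illustrative only.

```python
import cmath

C0 = 299_792_458.0                    # speed of light in vacuum [m/s]

eps_r = complex(2.5319, 2.0465)       # permittivity at the absorption peak
f1 = 0.7192e15                        # peak frequency [Hz]
cell_nm = 15.0                        # unit cell size [nm]

# Assumption: mu_r = 1, hence n = Re(sqrt(eps_r))
n = cmath.sqrt(eps_r).real
lambda_opt_nm = C0 / (n * f1) * 1e9
ratio = lambda_opt_nm / cell_nm
print(f"n = {n:.4f}, lambda_opt = {lambda_opt_nm:.2f} nm, ratio = {ratio:.1f}")
```

The wavelength in the medium comes out more than sixteen times larger than the unit cell, which supports the statement that the resonance originates from the material composition.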

IV. EQUIVALENT COMPOSITES OF METAMATERIALS 

DESIGNED WITH THE HIGH FREQUENCY MIXING RULE 

In this section the equivalent composite of a magnetic 

metamaterial is determined. The geometry of the 

metamaterial is the well-studied split ring resonator [1, 2,

3, 8], as shown in Fig. 5. The dimensions and the

material parameters are the same as in [8]. The size of the 

cubic unit cell is 5 mm, the split ring resonators are made 

of copper, the outer length of the exterior split ring 

resonators is 3 mm, the width of both split rings is 

0.25 mm, the thickness is 0.02 mm, the size of the gaps 

and the distance between the split ring resonators is 

0.5 mm. The substrate is made of dielectric with 

ε r = 3.84 and the thickness of the substrate is 0.25 mm. 

The metamaterial is periodic in the direction 

perpendicular to the propagation of the electromagnetic 

wave (z direction), the electric field is polarized in y 

direction, which means that the magnetic field is 

perpendicular to the plane of the split ring resonators. The 

metamaterial of [8] was designed to experimentally
demonstrate negative refraction. The role of the split-ring
resonators is to provide the negative magnetic
response, while additional copper wires placed on the
back side of the substrate are responsible for producing
the negative electric permittivity, leading to a negative
refractive index at a frequency of 10 GHz. In our
numerical simulations the metamaterial is only one unit cell
thick.

Figure 5: The unit cell of the magnetic metamaterial slab 

is composed of metallic split-ring resonators. 

The reflection, transmission and absorption spectra
of the double negative metamaterial [8] are presented in
Fig. 6.a, while in Fig. 6.b the electromagnetic response of
the split ring resonators is shown. The simulations reveal
that the position of the resonant peak at 10 GHz is not
changed by removing the wires; nevertheless, the shape of
the transmission and reflection curves is greatly affected.

The effective parameters of the magnetic metamaterial
built of split ring resonators are extracted from the
S-parameters with [20] and are shown in Fig. 7.a. The
magnetic permeability has a Lorentz shape with negative
values and it is similar to the magnetic permeability of
[8]. The effective electric permittivity has the shape of an
anti-resonance, which may be an artifact caused by the
replacement of the anisotropic metamaterial structure with
the homogenized model of an isotropic slab.

(a) 

(b) 

Figure 6: In (a) the reflection, transmission and 

absorption spectrum of the double negative metamaterial 

is presented, while in (b) the electromagnetic response of 

the split ring resonators is shown. 

Spectral fitting is carried out to find the parameters of
the composite, which is magnetically equivalent to the
metamaterial built of split ring resonators. The
parameters of the high frequency mixing rule, namely the radius
and the electric permittivity of the spherical inclusions,
the filling factor and the electric permittivity of the host
material, are determined by minimizing the mean square
error,

$$\Omega = \frac{1}{2N}\sum_{i=1}^{N}\left[\left(\frac{\mathrm{Re}(\mu_{ri}^{HF}) - \mathrm{Re}(\mu_{ri}^{TR})}{\mathrm{Re}(\mu_{ri}^{TR})}\right)^2 + \left(\frac{\mathrm{Im}(\mu_{ri}^{HF}) - \mathrm{Im}(\mu_{ri}^{TR})}{\mathrm{Im}(\mu_{ri}^{TR})}\right)^2\right],$$

where N is the number of data points in the spectra, Re()
and Im() return the real and imaginary parts of the
magnetic permeability $\mu_r^{TR}$ extracted from the
S-parameters or $\mu_r^{HF}$ calculated with the high frequency
mixing rule. The minimization is performed with the
differential evolution algorithm [21]. The minimization
provides r = 2.31 mm for the radius of the spherical
inclusions, the electric permittivity of the inclusions is
$\varepsilon_r^i = 37.67$, the filling factor is ζ = 0.13 and the electric
permittivity of the host material is $\varepsilon_r^h = 1$. Note that the

results of this optimization are implementable; several 

materials exist at microwave frequencies with even higher 

electric permittivity and are available as powder or 

suspension [7]. 
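
The error measure used in the spectral fitting can be written as a small function. This is a sketch under the assumption that the error is the plain average of the squared relative deviations of the real and imaginary parts; the name `relative_mse` and the sample spectra are hypothetical and are not data from the paper.

```python
def relative_mse(mu_hf, mu_tr):
    """Mean square relative error between two complex permeability spectra.

    mu_hf: values from the high frequency mixing rule
    mu_tr: values extracted from transmission reflection data
    """
    if len(mu_hf) != len(mu_tr) or not mu_tr:
        raise ValueError("spectra must be non-empty and of equal length")
    total = 0.0
    for hf, tr in zip(mu_hf, mu_tr):
        total += ((hf.real - tr.real) / tr.real) ** 2
        total += ((hf.imag - tr.imag) / tr.imag) ** 2
    return total / (2 * len(mu_tr))


# Hypothetical two-point spectra, for illustration only
mu_tr = [1.0 + 0.1j, -0.5 + 2.0j]
mu_hf = [1.1 + 0.1j, -0.5 + 2.0j]
omega_same = relative_mse(mu_tr, mu_tr)
omega_off = relative_mse(mu_hf, mu_tr)
print(omega_same, omega_off)
```

A global optimizer can then minimize this function over the radius, the two permittivities and the filling factor within chosen bounds; SciPy's `differential_evolution` is one readily available implementation of the algorithm family cited as [21], though the paper does not state which implementation was used.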

(a) 

(b) 

Figure 7: In (a) the effective magnetic permeability and 

electric permittivity of the metamaterial made of split-ring 

resonators extracted from S-parameters is shown. In (b) 

the material parameters of the equivalent composite are 

presented. 

The real and imaginary parts of the magnetic
permeability and the electric permittivity of the equivalent
composite are plotted in Fig. 7.b. As can be seen, there
is a good agreement between the effective magnetic
permeability of the metamaterial and the permeability of
the composite. Comparing the real parts of the electric
permittivities, it can be observed that they are comparable
at low frequencies, for example at 5 GHz $\varepsilon_r^{TR} = 1.57$ and
$\varepsilon_r^{HF} = 1.43$, even though no optimization goal is
formulated for the permittivity in the mean square error of the
minimization procedure. On the other hand, there is no
anti-resonant behavior in the electric permittivity of the
composite in the frequency region of the magnetic
resonance. The magnetic resonance of the composite is
followed by an electric resonance, which appears at the
upper end of the investigated frequency region. To move
the electric resonance outside of this frequency region,
the bounds of the optimization parameters were changed
and several minimization runs were performed. As a
result it can be observed that the model does not provide
enough freedom to maintain the strength and the position
of the magnetic resonance and at the same time to shift
the electric resonance to higher
frequencies. The extension of the high frequency model to
ellipsoidal particles may solve this issue.

In Fig. 8 the magnitudes of the S-parameters for the 

metamaterial built of split ring resonator and those for the 

equivalent composite are presented. The difference 

between the curves is due to the difference in electric 

permittivities. The correspondence may be improved by 

considering frequency dependent material parameters. 

Figure 8: Comparison between the magnitudes of the 

S-parameters of the metamaterial built of split ring 

resonator and the S-parameters of the equivalent 

composite. 

V. CONCLUSION


A high frequency mixing rule, which is based on the
Clausius-Mossotti relation and the first terms of the Mie
expansion corresponding to radiating dipoles, has been
applied to characterize composites and metamaterials. To
validate the model, the effective electric permittivity and
magnetic permeability of a deep subwavelength
composite were calculated and it was shown that similar
results are produced by the high frequency mixing rule,
by the Maxwell-Garnett mixing rule and by direct extraction
from the S-parameters.

The developed high frequency model can open 

alternative ways to engineer required electromagnetic 

properties. It was shown that an equivalent composite, which

has a similar effective magnetic permeability, can be

assigned to the magnetic metamaterial built of split ring

resonators.

VI. ACKNOWLEDGEMENT 

This work has been supported by the János Bolyai 

Research Fellowship of the Hungarian Academy of 

Sciences and OTKA 105996. 


REFERENCES

[1] L. Solymár and E. Shamonina, Waves in Metamaterials, Oxford University Press, 2009.

[2] R. Marqués, F. Martín, M. Sorolla, Metamaterials with Negative Parameters, John Wiley and Sons, 2008.

[3] N. Engheta, R. W. Ziolkowski, Metamaterials Physics and Engineering Applications, John Wiley and Sons, 2006.

[4] J. B. Pendry, Negative Refraction Makes a Perfect Lens, Physical Review Letters, vol. 85, no. 18, pp. 3966–3969, 2000.

[5] J. B. Pendry, A. J. Holden, D. J. Robbins, and W. J. Stewart, Magnetism from Conductors and Enhanced Non-Linear Phenomena, IEEE Transactions on Microwave Theory and Techniques, vol. 47, 2075, 1999.

[6] D. R. Smith, W. Padilla, D. Vier, S. Nemat-Nasser and S. Schultz, Composite Medium with Simultaneously Negative Permeability and Permittivity, Phys. Rev. Lett., vol. 84, p. 4184, 2000.

[7] V. M. Shalaev, Optical negative-index metamaterials, Nature Photonics, vol. 1, pp. 41-48, 2006.

[8] R. A. Shelby, D. R. Smith, S. Schultz, Experimental Verification of a Negative Index of Refraction, Science, vol. 292, pp. 77-79, 2001.

[9] G. Dolling, M. Wegener, C. Soukoulis and M. S. Linden, Negative-index metamaterial at 780 nm wavelength, Opt. Lett., vol. 32, no. 1, pp. 53-55, 2007.

[10] A. F. de Baas (editor), Nanostructured Metamaterials, European Commission, 2010.

[11] Zs. Szabó, G.-H. Park, R. Hedge, and E.-P. Li, “A unique 

extraction of metamaterial parameters based on Kramers-Kronig 

relationship,” IEEE Trans. Microwave Theory Tech., vol. 58, no. 

10, pp. 2646-2653, 2010. 

[12] A. Sihvola, Electromagnetic Mixing Formulas and Applications, 

The Institution of Electrical Engineers, London, United Kingdom, 

1999. 

[13] D. E. Aspnes, Local-field effects and effective-medium theory: A 

microscopic perspective, Am. J. Phys., vol. 50, no. 8, pp. 704-709, 

1982. 

[14] C. F. Bohren, D. R. Huffman, Absorption and Scattering of Light 

by Small Particles, Wiley-VCH, 2004. 

[15] R. Ruppin, Evaluation of extended Maxwell-Garnett theories, 

Optics Communications, vol 182, pp. 273–279, 2000. 

[16] C. A. Grimes, D. M. Grimes, Permeability and permittivity 

spectra of granular materials, Phys. Rev. B, vol. 43, pp. 10780– 

10788, 1991. 

[17] E.D. Palik and G.K. Ghosh, Editors, Handbook of Optical 

Constants of Solids, Academic Press, New York, 1997. 

[18] [Online] http://www.sspectra.com/sopra.html 

[19] [Online] www.cst.com 

[20] [Online] http://effmetamatparam.sourceforge.net/ 

[21] K. V. Price, R. M. Storn, J. A. Lampinen, Differential Evolution, 

A Practical Approach to Global Optimization, Springer, 2005.


Enhancement of Maximum Starting Torque and 

Efficiency in Permanent Magnet Synchronous Motors 

Jawad Faiz, Vahid Ghorbanian and Bashir Mahdi Ebrahimi 

Center of Excellence on Applied Electromagnetic Systems, School of Electrical and Computer 

Engineering, College of Engineering, University of Tehran, Tehran 1439957131, Iran 

(e-mail: jfaiz@ut.ac.ir) 

Abstract: This paper presents a new algorithm for enhancing the maximum starting torque and the steady-state efficiency of permanent magnet (PM) motors. The algorithm comprises two strategies, one to raise the starting torque and one to decrease the losses of the PM motor, so both the transient and the steady-state operation of the motor are improved. Modeling the core losses is essential when estimating the efficiency of the PM motor. Appropriate control coefficients based on the introduced algorithm are set for the two aforementioned goals. Simulation results demonstrate the competency of the proposed algorithm.

Index Terms: PM motor, control strategy, starting torque, losses minimization, efficiency.


I. INTRODUCTION

Application of permanent magnet (PM) motors is increasing owing to advances in PM manufacturing technology and to the high power density, improved power factor and high efficiency of these motors. Some applications, such as electric and hybrid vehicles with frequent start-stop cycles, need a high starting torque to accelerate the vehicle quickly. Enhancing the starting torque requires a larger stator winding current, which raises the winding temperature. Therefore, the stator current must be limited. Since improving the performance of these motors has a considerable effect on the electrical energy consumed in long-duration applications, instantaneous optimal control and an appropriate operating point over different loads and speeds should be considered.

Nowadays, a wide range of motor speeds and torques is achieved by applying vector control methods [1]-[4]. These methods provide stability, precise speed regulation and an accurate response because the feedback allows the flux and the torque to be controlled independently [5]. The controllable quantities are the currents id and iq. Controlling id adjusts the motor flux and consequently the speed, and controlling iq regulates the steady-state output torque [6]. In [7], [8], a method has been introduced to control the torque of a PM machine by limiting the winding currents. That machine was used as the generator of a wind turbine: when the wind speed exceeds the rated speed, the torque rises and the winding insulation may fail. Some methods [7] need no mechanical sensor. Since magnetic saturation has a considerable impact on the motor behavior at high currents, the saturation effects are approximately modeled in this paper. Meanwhile, the maximum torque control strategy is normally applied to the motor in the non-starting case [3].

Different strategies have so far been applied to improve the efficiency of the PM motor. In [9], the efficiency has been improved by introducing a new tooth and slot structure. In [7], [12], a method based on the dq model of the motor has been proposed in which the core losses are taken into account by a resistance in the equivalent circuit of the motor. This technique is called “loss model control”: copper and core losses are evaluated as an analytical function of the equivalent circuit parameters and of the currents id and iq. By obtaining the optimal currents, the motor losses can be minimized and the efficiency maximized. In interior PM (IPM) motors, Lq and Ld are unequal and it is difficult to obtain the optimal operating point analytically at different speeds and torques. To overcome this problem, id and iq are normally expressed as functions of the motor speed and the coefficients of these functions are stored in a lookup table. The drawback of this method is that the tables grow as the speed and torque ranges increase, and they must be updated whenever the motor parameters change. In [1], [5], [10], motor efficiency has been improved by the id=0 method. In the PM motor model, the reluctance torque appears as a coefficient of id; by setting id=0, the reluctance torque, acting as an opposing torque, is eliminated and consequently the output power increases at a fixed speed. Applying id=0 to IPM motors needs a high-power inverter. Therefore, this method is normally applied to surface-mounted PM (SPM) motors [10]. The unity power factor method develops less maximum torque than the other methods [10], so it is not suitable for high-torque applications. The flux-linkage control method performs better in IPM motors.

In contrast to loss model control, which depends on the accuracy of the motor model, the methods presented in [2], [3], [13] are called search control methods and are independent of the motor and drive parameters. These methods attempt to reduce the input power, normally by controlling the voltage or the dc-link current of the inverter. Search control may produce undesirable oscillations in the torque and speed of the motor, which can lead to instability [14]. A frequency stabilizer is therefore necessary in these methods.

Previous papers have not addressed both high starting torque and steady-state efficiency improvement in the PM motor. This paper investigates a control strategy for maximum torque together with efficiency enhancement in the steady-state operation of the motor. By applying this method, the torque rises up to 4 pu and the current up to 2 pu. The distinction between this paper and [10] is the design of an intelligent system for limiting id and iq: at any instant, the sensitivity of the torque to each current component is calculated and the component that has less effect on the torque development is limited. To improve the steady-state efficiency of the motor, the loss model control algorithm of [5] is used. Section II introduces the motor model. Section III describes the control strategy. Sections IV and V present the simulation method and results, respectively. Finally, Section VI concludes the paper.

II. MODEL OF MOTOR 

The control strategies proposed in this paper are based on the analytical equations of the motor model. Since the maximum starting torque and the steady-state efficiency of the motor are enhanced through the control of the id and iq current components, Park's model is used to convert the three-phase abc equations of the motor into two-phase dq equations in which the currents id and iq are available. Figure 1 shows the IPM motor model, where Ra is the stator resistance, Ld is the d-axis inductance and Lq is the q-axis inductance.

Figure 1: Two-axes model of IPM motor 

These two inductances are not equal in IPM motors and they develop a considerable reluctance torque. The iron losses consist of the hysteresis and eddy current losses and are modeled by the equivalent resistance Rc, which depends on the temperature and frequency. To simplify the computations, this resistance is evaluated at the rated conditions. In [5], the leakage and magnetizing inductances are separated and inserted in different branches. Since Rc is much larger than the other impedances, the current of the loss branch is small and the currents on the left-hand and right-hand sides of that branch do not differ considerably. Therefore, the sum of the leakage and magnetizing inductances is used in the model.

The governing equations of the motor model are as follows:

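$$v_d = R_a i_d - \omega L_q i_{oq} + L_d \frac{d i_{od}}{dt} \qquad (1)$$

$$v_q = R_a i_q + \omega (L_d i_{od} + \psi_a) + L_q \frac{d i_{oq}}{dt} \qquad (2)$$

$$i_{cd} = i_d - i_{od} \qquad (3)$$

$$i_{cq} = i_q - i_{oq} \qquad (4)$$

$$i_{cd} = \frac{1}{R_c}\left(-\omega L_q i_{oq} + L_d \frac{d i_{od}}{dt}\right) \qquad (5)$$

$$i_{cq} = \frac{1}{R_c}\left(\omega (L_d i_{od} + \psi_a) + L_q \frac{d i_{oq}}{dt}\right) \qquad (6)$$

The developed electromagnetic torque of the motor is:

$$T_e = \frac{3P}{2}\left[\psi_a i_{oq} + (L_d - L_q)\, i_{od} i_{oq}\right] \qquad (7)$$

The dynamic equation of the motor is as follows:

$$T_e = T_m + c\,\mathrm{sign}(\omega_r) + F \omega_r + J \frac{d\omega_r}{dt} \qquad (8)$$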
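As a rough illustration of how (1)-(7) can be evaluated in steady state (all time derivatives set to zero), the following Python sketch uses the Table I values. It assumes that P in (7) denotes the number of pole pairs (3 for the 6-pole motor) and that ω is the electrical angular speed in rad/s; neither convention is stated explicitly in the text, and the function and variable names are illustrative only.

```python
# Table I values; P = 3 pole pairs is an assumption (6-pole motor).
RA, RC = 2.21, 840.0          # stator and core-loss resistances, ohm
LD, LQ = 9.77e-3, 14.94e-3    # d- and q-axis inductances, H
PSI_A = 0.0844                # permanent magnet flux, Wb
P = 3                         # pole pairs (assumed meaning of P)


def steady_state(i_od, i_oq, omega):
    """Evaluate (1)-(7) with d/dt = 0; omega is the electrical speed in rad/s."""
    i_cd = -omega * LQ * i_oq / RC                  # (5), core-loss branch, d axis
    i_cq = omega * (LD * i_od + PSI_A) / RC         # (6), core-loss branch, q axis
    i_d = i_od + i_cd                               # from (3)
    i_q = i_oq + i_cq                               # from (4)
    v_d = RA * i_d - omega * LQ * i_oq              # (1)
    v_q = RA * i_q + omega * (LD * i_od + PSI_A)    # (2)
    t_e = 1.5 * P * (PSI_A * i_oq + (LD - LQ) * i_od * i_oq)  # (7)
    return {"i_d": i_d, "i_q": i_q, "v_d": v_d, "v_q": v_q, "t_e": t_e}


if __name__ == "__main__":
    print(steady_state(i_od=-1.0, i_oq=4.5, omega=1256.6))
```

With these assumptions, a negative i_od adds a positive reluctance term to the torque because Ld < Lq for this motor.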

III. CONTROL STRATEGY 

A. Enhancement of Maximum Torque 

Eqn. (7) indicates that the motor torque depends on the id and iq components of the current. At starting, the stator current depends little on the load and is normally 2 to 2.5 times the rated value. The starting torque of the motor can be up to 2.5 times the rated torque and generally there is no need to reduce it. However, in some applications, such as the ABS brakes of cars, the motor must have a very short deceleration time (a fraction of a millisecond), and a starting torque of about 2.5 times the rated torque cannot respond quickly enough in the no-control mode. Therefore, without control, the high starting torque would have to be applied over a longer time to provide an appropriate deceleration time. Previous studies have proposed control methods to increase the starting torque; however, a limited starting current has not been considered. Here a novel technique is introduced that optimally limits the stator current under vector control and also enhances the maximum starting torque. By applying this method, the starting torque rises up to 4 times the rated torque. The method is based on the id and iq components of the current. Since the equivalent resistance of the iron losses is almost infinite:

$$i_q \approx i_{oq} \qquad (9)$$

$$i_d \approx i_{od} \qquad (10)$$

Suppose the stator current i_s is constant during the starting period; then i_q is as follows:

$$i_q^2 + i_d^2 = i_s^2 \;\Rightarrow\; i_q^2 = i_s^2 - i_d^2 \qquad (11)$$

Combining (7) and (11) leads to:

$$T_e = \frac{3P}{2}\sqrt{i_s^2 - i_d^2}\,\left[\psi_a + (L_d - L_q)\, i_d\right] \qquad (12)$$

where i_s can be taken as 2 to 2.5 times the rated current. In practice, the back-emf increases as the motor accelerates and this decreases the stator current. However, to simplify the equations it is taken to be constant. The optimal id is obtained by setting the derivative of the torque with respect to id equal to zero:

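$$\frac{dT_e}{di_d} = 0 \;\Rightarrow\; i_d = -\frac{\psi_a}{4(L_d - L_q)} \pm \sqrt{\left(\frac{\psi_a}{4(L_d - L_q)}\right)^2 + \frac{i_s^2}{2}} \qquad (13)$$

For positive id, the reluctance torque term (Ld - Lq) id iq in (7) is negative, since Ld < Lq in the IPM motor, and the total torque of the motor reduces. So, only the negative sign is acceptable:

$$i_d = -\frac{\psi_a}{4(L_d - L_q)} - \sqrt{\left(\frac{\psi_a}{4(L_d - L_q)}\right)^2 + \frac{i_s^2}{2}} \qquad (14)$$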
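The closed form (14) can be checked numerically against the torque expression (12). The sketch below is only an illustration under stated assumptions: P is taken as the number of pole pairs (3), the stator current limit is taken as twice the rated peak current (2 x 3.6 x sqrt(2) A), and Ld < Lq is required so that the negative root is the valid one. The function names are hypothetical.

```python
import math


def torque_at_limit(i_d, i_s, psi_a, l_d, l_q, pole_pairs):
    """Torque from (12) when the stator current magnitude is held at i_s."""
    i_q = math.sqrt(i_s ** 2 - i_d ** 2)
    return 1.5 * pole_pairs * i_q * (psi_a + (l_d - l_q) * i_d)


def optimal_id(i_s, psi_a, l_d, l_q):
    """d-axis current from (14); valid for l_d < l_q (IPM motor)."""
    if l_d >= l_q:
        raise ValueError("(14) with the negative root assumes Ld < Lq")
    a = psi_a / (4.0 * (l_d - l_q))
    return -a - math.sqrt(a * a + i_s ** 2 / 2.0)


if __name__ == "__main__":
    i_s = 2.0 * 3.6 * math.sqrt(2.0)          # assumed limit: 2 x rated peak current
    i_d = optimal_id(i_s, 0.0844, 9.77e-3, 14.94e-3)
    print(i_d, torque_at_limit(i_d, i_s, 0.0844, 9.77e-3, 14.94e-3, 3))
```

Because the back-emf is ignored, as in the text, the result is a reference value for the starting instant rather than a prediction of the whole transient.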

Applying id from (14) as the id reference raises the torque. However, with the current components obtained in this way the stator current exceeds the permissible current. Therefore, the sensitivity of the torque to each current component is calculated and the component that has less influence on the torque development is limited:

$$S_{i_d}^{T_e} = \left.\frac{\partial T_e}{\partial i_d}\right|_{i_q = \mathrm{cte}} = \frac{3P}{2}(L_d - L_q)\, i_q \qquad (15)$$

$$S_{i_q}^{T_e} = \left.\frac{\partial T_e}{\partial i_q}\right|_{i_d = \mathrm{cte}} = \frac{3P}{2}\left(\psi_a + (L_d - L_q)\, i_d\right) \qquad (16)$$

By applying this limit, the stator current does not rise beyond 2 times the rated current.
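The text does not spell out how the less influential component is limited, so the following Python sketch shows only one possible reading: the sensitivities (15) and (16) are compared, the component with the smaller absolute sensitivity is reduced, and the other is kept (or clipped to the limit) so that the current magnitude does not exceed i_max. The rule, the function names and the choice of P as pole pairs are assumptions, not the authors' implementation.

```python
import math


def sensitivities(i_d, i_q, psi_a, l_d, l_q, pole_pairs):
    """Torque sensitivities (15) and (16), obtained by differentiating (7)."""
    k = 1.5 * pole_pairs
    s_id = k * (l_d - l_q) * i_q
    s_iq = k * (psi_a + (l_d - l_q) * i_d)
    return s_id, s_iq


def limit_currents(i_d, i_q, i_max, psi_a, l_d, l_q, pole_pairs):
    """Keep sqrt(i_d^2 + i_q^2) <= i_max by shrinking the less sensitive axis."""
    if math.hypot(i_d, i_q) <= i_max:
        return i_d, i_q                       # already within the limit
    s_id, s_iq = sensitivities(i_d, i_q, psi_a, l_d, l_q, pole_pairs)
    if abs(s_id) <= abs(s_iq):
        # d axis contributes less torque per ampere: keep i_q, reduce i_d
        i_q_new = math.copysign(min(abs(i_q), i_max), i_q)
        rest = math.sqrt(max(i_max ** 2 - i_q_new ** 2, 0.0))
        return math.copysign(rest, i_d), i_q_new
    # q axis contributes less: keep i_d, reduce i_q
    i_d_new = math.copysign(min(abs(i_d), i_max), i_d)
    rest = math.sqrt(max(i_max ** 2 - i_d_new ** 2, 0.0))
    return i_d_new, math.copysign(rest, i_q)


if __name__ == "__main__":
    print(limit_currents(-6.0, 9.0, 10.0, 0.0844, 9.77e-3, 14.94e-3, 3))
```

In a drive this check would be repeated at every control step, since the sensitivities change with the operating point.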

B. Improvement of Steady-state Efficiency 

The major factors that reduce the efficiency of the motor are the copper and iron losses. In a vector-controlled motor supplied by an inverter, high-order harmonics generate additional losses. In spite of this, the major part of the losses is due to the fundamental harmonic. The copper losses depend directly, and the iron losses indirectly, on the motor current. The copper and iron losses arising from the fundamental harmonic are optimized by vector control of the stator currents. The high-order harmonic losses are uncontrollable. Loss control is based on estimating the motor losses with the presented model and optimizing them with respect to the motor currents. Since efficiency is defined in steady state, the time derivatives in the dynamic equations are set equal to zero and the losses are calculated as follows:

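$$W_{cu}(i_{od}, i_{oq}, \omega) = \frac{3R_a}{2}\left(i_d^2 + i_q^2\right) = \frac{3R_a}{2}\left[\left(i_{od} - \frac{\omega L_q i_{oq}}{R_c}\right)^2 + \left(i_{oq} + \frac{\omega(\psi_a + L_d i_{od})}{R_c}\right)^2\right] \qquad (17)$$

$$W_{fe}(i_{od}, i_{oq}, \omega) = \frac{3R_c}{2}\left(i_{cd}^2 + i_{cq}^2\right) = \frac{3}{2R_c}\left[(\omega L_q i_{oq})^2 + \omega^2(\psi_a + L_d i_{od})^2\right] \qquad (18)$$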

where Wcu and Wfe are the copper losses and iron losses, respectively. The motor losses are a function of iod, ioq and ω. In these equations, the influence of temperature rise and higher harmonics on the resistances, and of magnetic saturation on the inductances, has been ignored and these parameters are taken to be constant. In reality, the losses increase nonlinearly because of the high-order harmonics and the considerable rise of the magnetizing current caused by saturation. Magnetic saturation normally occurs at starting, and in steady state the motor operates at the knee of the magnetization characteristic; therefore, neglecting the saturation is acceptable. By combining (7), (17) and (18), the following equation is obtained:

$$W_c(i_{od}, T_e, \omega) = W_{fe}(i_{od}, T_e, \omega) + W_{cu}(i_{od}, T_e, \omega) \qquad (19)$$
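To make (17)-(19) concrete, the sketch below evaluates the copper, iron and total losses for a given i_od at a fixed torque and speed, eliminating i_oq through the torque equation (7). It uses the Table I values and assumes that P is the number of pole pairs (3) and that ω is the electrical speed, that is, pole pairs times the mechanical speed; these conventions and the function name are assumptions made for illustration.

```python
import math

RA, RC = 2.21, 840.0          # ohm (Table I)
LD, LQ = 9.77e-3, 14.94e-3    # H (Table I)
PSI_A = 0.0844                # Wb (Table I)
P = 3                         # pole pairs (assumed meaning of P)


def losses(i_od, t_e, omega):
    """Return (W_cu, W_fe, W_c) from (17), (18), (19) at torque t_e, speed omega."""
    # i_oq follows from the torque equation (7) for the chosen i_od
    i_oq = t_e / (1.5 * P * (PSI_A + (LD - LQ) * i_od))
    i_d = i_od - omega * LQ * i_oq / RC
    i_q = i_oq + omega * (PSI_A + LD * i_od) / RC
    w_cu = 1.5 * RA * (i_d ** 2 + i_q ** 2)                        # (17)
    w_fe = 1.5 / RC * ((omega * LQ * i_oq) ** 2
                       + (omega * (PSI_A + LD * i_od)) ** 2)       # (18)
    return w_cu, w_fe, w_cu + w_fe                                 # (19)


if __name__ == "__main__":
    omega_rated = P * 2.0 * math.pi * 4000.0 / 60.0   # electrical rad/s at 4000 rpm
    for i_od in (0.0, -1.0, -2.0):
        print(i_od, losses(i_od, 1.8, omega_rated))
```

Under these assumptions the total loss at the rated point is lower for a moderately negative i_od than for i_od = 0, which is the effect the loss minimization relies on.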

As indicated in (19), the total losses of the motor depend on the operating point and on the current iod. At an operating point with fixed speed and torque, the optimal iod for loss reduction is obtained from the analytical derivative of (19). However, in the IPM motor Ld and Lq are not identical, so the equations are complicated and the analytical derivative is difficult to use. Sometimes the currents of the motor are expressed as polynomials of each other, with polynomial coefficients that depend on the speed of the motor. The coefficients of the polynomials are stored in a look-up table as a function of the motor operating point. This method is quick, but interpolation between operating points makes it highly approximate. Moreover, the tables become large over a wide range of speed and torque, and they must be updated when the motor type changes. The flow-chart reported in [5] optimizes the motor losses without using either the analytical derivative or a look-up table. That algorithm is iterative and normally converges to an appropriate solution after 14 iterations. In the present paper, the part of the flow-chart in [5] that handles the outcome of the final conditional expression is modified, so the optimal point is reached with fewer iterations at any operating point. This algorithm increases the developed electromagnetic torque of the motor at a constant operating point and consequently improves the efficiency of the motor. Since iq is the torque component of the current, its value depends largely on the load of the motor, and a large change in iq would lose the stable operating point.

Therefore, the reluctance torque is controlled by changing id. In traditional methods such as id=0, the reluctance torque is almost zero and the demagnetization effect of the stator current diminishes. The idea used in the new method is to make id negative, which makes the reluctance torque positive and improves the efficiency of the motor at fixed speed. idmax is generally taken to be a small positive value and idmin a large negative value. d defines the step variation of id and x the mean value of id in every step. The number of steps required for an appropriate response depends on the value of id, which is finally fixed at its optimal value. The simulation results show that the presented control method improves the motor efficiency compared with the id=0 method.

Figure 2: Loss minimization algorithm

Figure 3: Motor and control system
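The flow-chart of Figure 2 is not reproduced in the text, so the sketch below does not implement it. It substitutes a standard golden-section search over the interval [idmin, idmax] to show how a derivative-free, table-free minimization of the total loss (19) can work. The interval bounds, the tolerance, the choice of P as pole pairs and all names are assumptions for illustration only.

```python
import math

RA, RC = 2.21, 840.0          # ohm (Table I)
LD, LQ = 9.77e-3, 14.94e-3    # H (Table I)
PSI_A, P = 0.0844, 3          # Wb (Table I); pole pairs (assumed meaning of P)


def total_loss(i_od, t_e, omega):
    """Total loss (19) at fixed torque and electrical speed."""
    i_oq = t_e / (1.5 * P * (PSI_A + (LD - LQ) * i_od))   # from (7)
    e_d = -omega * LQ * i_oq
    e_q = omega * (PSI_A + LD * i_od)
    w_cu = 1.5 * RA * ((i_od + e_d / RC) ** 2 + (i_oq + e_q / RC) ** 2)
    w_fe = 1.5 * (e_d ** 2 + e_q ** 2) / RC
    return w_cu + w_fe


def minimize_loss(t_e, omega, id_min=-6.0, id_max=0.5, tol=1e-4):
    """Golden-section search for the loss-minimizing i_od (stand-in for Figure 2)."""
    g = (math.sqrt(5.0) - 1.0) / 2.0
    lo, hi = id_min, id_max
    x1, x2 = hi - g * (hi - lo), lo + g * (hi - lo)
    f1, f2 = total_loss(x1, t_e, omega), total_loss(x2, t_e, omega)
    iterations = 0
    while hi - lo > tol:
        if f1 < f2:                       # minimum lies in [lo, x2]
            hi, x2, f2 = x2, x1, f1
            x1 = hi - g * (hi - lo)
            f1 = total_loss(x1, t_e, omega)
        else:                             # minimum lies in [x1, hi]
            lo, x1, f1 = x1, x2, f2
            x2 = lo + g * (hi - lo)
            f2 = total_loss(x2, t_e, omega)
        iterations += 1
    return (lo + hi) / 2.0, iterations


if __name__ == "__main__":
    omega = P * 2.0 * math.pi * 4000.0 / 60.0
    i_opt, n = minimize_loss(1.8, omega)
    print(i_opt, n, total_loss(i_opt, 1.8, omega), total_loss(0.0, 1.8, omega))
```

The search assumes that the loss has a single minimum inside the chosen interval; the iteration count it needs depends on the tolerance and is not comparable to the 14 iterations quoted for the algorithm of [5].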

IV. SIMULATION METHOD 

The above-mentioned control strategies are applied to a PM motor under vector control. The outputs of these control methods provide the reference values of the drive current. Since the motor supply under vector control is of PWM type, high-order odd harmonics are injected into the motor. The amplitude of these harmonics varies with the operating point. Figure 3 shows the complete system of the motor and drive. The LMA block implements the steady-state efficiency improvement strategy and the T/A block implements the maximum starting torque enhancement. The Limitation block limits the motor currents. The reference current values are transformed into the two-phase reference voltages by the Vqd_ref block. The motor reference voltages are then formed by transforming the two-phase voltages to three-phase voltages and applying them to the PWM block. The maximum starting torque algorithm is applied during the transient period and the efficiency improvement algorithm during the steady-state period. In the motor model, a low-pass filter is used to eliminate high-order harmonics from the control process. The specifications of the simulated motor are summarized in Table I.
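The two-phase to three-phase step mentioned above is commonly done with the inverse Park transformation. The sketch below shows the amplitude-invariant form, assuming the d axis is aligned with phase a at θ = 0 and the q axis leads the d axis by 90 electrical degrees; the paper does not state which convention its Simulink model uses, so this is an illustration rather than a description of that model.

```python
import math


def dq_to_abc(v_d, v_q, theta):
    """Amplitude-invariant inverse Park transform; theta is the electrical rotor angle in rad."""
    shift = 2.0 * math.pi / 3.0
    v_a = v_d * math.cos(theta) - v_q * math.sin(theta)
    v_b = v_d * math.cos(theta - shift) - v_q * math.sin(theta - shift)
    v_c = v_d * math.cos(theta + shift) - v_q * math.sin(theta + shift)
    return v_a, v_b, v_c


if __name__ == "__main__":
    # Hypothetical dq reference voltages at an arbitrary rotor angle
    print(dq_to_abc(-20.0, 110.0, math.pi / 6.0))
```

The three outputs always sum to zero, so they can be passed directly to a PWM modulator as balanced phase references.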

V. SIMULATION RESULTS 

TABLE I
NAMEPLATE DATA AND PARAMETERS OF IPM MOTOR

Parameter                                 Value
Number of poles                           6
Rated torque (Nm)                         1.8
Rated rms current (A)                     3.6
Rated speed (rpm)                         4000
Stator winding resistance Ra (Ω)          2.21
Core loss equivalent resistance Rc (Ω)    840
Direct axis inductance (mH)               9.77
Quadrature axis inductance (mH)           14.94
Permanent magnet flux ψa (Wb)             0.0844
Mechanical losses (Nm)                    0.04

The results have been obtained by simulating the controlled motor in Simulink. First, the results of applying the T/A algorithm to the motor during the transient mode are considered. Using the motor parameters and assuming a constant maximum motor current, the id component of the stator current is calculated by T/A and applied to the reference input of the drive. Since a high torque is necessary at starting, at the first instant the current id increases considerably in the negative direction (Figure 4a). A negative id produces a positive reluctance torque and increases the total torque of the motor. When T/A is applied, iq also increases and the motor current rises above the permissible limit.

Therefore, the currents are limited. Figure 4b shows the variation of the normalized electromagnetic torque of the motor (based on the rated values), in which the torque rises up to 3.8 pu owing to the application of T/A. This torque is much larger than the torque the motor can develop with no control strategy. To study the performance of the stator current limiter, the three-phase currents of the motor are shown in Figure 5, which indicates that the phase current of the motor rises up to 1.8 pu. This current rises up to 2.5 pu when the current limiter is not used. The torque is constant during the transient mode because the peak current tolerable by the stator winding is assumed constant. If this current were applied to the model as a function of the motor emf, the motor torque would decrease with time. After completion of the transient period, the control algorithm switches to LMA. The change of control strategy is also visible in the three-phase currents of the motor, where it creates some irregularities. In addition to limiting the current, reducing the overshoot and shortening the settling time of the motor speed are other advantages of applying T/A. Figure 6 compares the motor speed variation with and without T/A. The settling time of the motor from 0 to the rated speed decreases from 0.025 s in the no-control case to 0.01 s with T/A, and without overshoot. In fact, by applying the maximum torque control, the motor becomes more stable. Note that the torque jump due to switching from T/A to LMA is not present in the speed signal, because of the high inertia and the long mechanical time constant of the motor compared with its electrical time constant. The LMA attempts to find the optimal id for efficiency improvement. Normally, this algorithm selects negative values of id. The effect of a negative id is to enhance the torque and decrease the losses in the motor. The PM motor under LMA control has been simulated over a wide range of speed and torque, and the effect of this algorithm on the motor variables and system efficiency has been investigated. The outputs of LMA are also compared with those of id=0 to show the advantage of this method over conventional methods.

Figure 4: (a) id and (b) motor torque at starting period [plots of id (A) and torque (pu) versus time (s)]

Figure 5: Three-phase currents of controlled motor [plot of iabc (pu) versus time (s)]

Figure 6: Time variations of motor speed [plot of speed (rpm) versus time (s), with T/A control and without T/A control]

Figure 7 shows the variation of the copper and iron losses of the motor versus speed and torque. As the speed rises at the rated load, the iron losses increase, and in this case the loss reduction algorithm shows its dominant effect. At fixed speed and high loads, the reduction of the total iron and copper losses is also considerable. Meanwhile, by increasing the negative value of id, the demagnetization effect of the PM decreases and, for a fixed output power, the supply voltage reduces. This means an efficiency improvement. The difference between the motor losses in the id=0 and LMA cases changes the motor efficiency.

Figure 8 shows the efficiency versus the speed and torque of the motor for the id=0 and LMA cases. According to Figure 8a, the efficiency of the motor under LMA control is higher than with the id=0 method over the different speeds. As the speed of the motor increases, the efficiency improvement also rises. This means that the closer the motor approaches the rated operating point, the more its efficiency improves. Figure 8b shows the efficiency versus load torque at the rated speed, and it confirms the efficiency improvement of LMA compared with the conventional methods. The impact of this method is higher for higher loads. As shown in Figure 8a, there is not much difference between LMA and id=0 at low load levels, so the efficiency does not change considerably. This is not true at low speeds. Generally, electrical machines operate at the knee of the magnetization characteristic, where they have peak power density; therefore they have the maximum efficiency at the rated operating point. Applying LMA at the rated operating point raises the efficiency by 3%. In the references, the LMA algorithm has been applied to the PM motor experimentally. The difference between the simulation and experimental results is due to the approximations included in the simulation. The most important factor is ignoring the magnetic saturation.

Figure 7: Losses of motor versus (a) speed and (b) torque [plots of total loss (W) versus speed (rpm) and versus torque (Nm), for LMA and id=0]

Figure 8: Efficiency of the motor versus (a) speed and (b) torque [panel (a): comparison of the motor efficiencies (%) at rated load (1.8 N.m) versus the angular speed (rpm) for LMA and id=0 controls; panel (b): comparison of the motor efficiencies (%) at rated speed (4000 rpm) versus the load torque (N.m) for LMA and id=0 controls]

VI. CONCLUSION

Two algorithms for enhancing the maximum starting torque and the steady-state efficiency of a PM motor were investigated. Both methods use the stator current components, and the transient and steady-state performance of a PMSM has been improved by a closed-loop control strategy applying the two algorithms. These algorithms are independent of the PM motor type. The simulation results show that applying the two algorithms considerably improves the performance of the motor compared with conventional control methods. Applying the stator current limiting method during the starting period protects the stator winding insulation against damage due to high current. The LMA method improves the steady-state efficiency of the motor by up to 3% at the rated load and the T/A method increases the starting torque up to 4 times the rated torque. Therefore, the proposed control methods provide a high starting torque without the risk of a short circuit in the windings, and they also prevent the extra losses arising from imprecise control of electrical motors.

ACKNOWLEDGEMENT

We sincerely thank Iran's National Elites Foundation (INEF) for financial support of the project.

VII. REFERENCES

[1] S. Morimoto, Y. Tong, Y. Takeda, and T. Hirasa, “Loss minimization control of permanent magnet synchronous motor drives,” IEEE Transactions on Industrial Electronics, vol. 41, no. 5, pp. 511-517, Oct. 1994.

[2] C. Mademlis, L. Xypteras, and N. Margaris, “Loss minimization in surface permanent magnet synchronous motor drives,” IEEE Transactions on Industrial Electronics, vol. 47, no. 1, pp. 115-122, Feb. 2000.

[3] S. Vaez and M. A. Rahman, “Adaptive loss minimization control of inverter fed IPM motor drives,” IEEE Power Electronics Specialists Conference, vol. 2, pp. 861-868, 1997.

[4] T. M. Jahns, G. B. Kliman, and T. W. Neumann, “Interior permanent magnet synchronous motor for adjustable speed drives,” IEEE Transactions on Industry Applications, vol. 22, no. 4, pp. 738-747, July/August 1986.

[5] C. Cavallaro, A. O. Tommaso, R. Miceli, and A. Raciti, “Efficiency enhancement of permanent magnet synchronous motor drives by online loss minimization approaches,” IEEE Transactions on Industry Applications, vol. 52, no. 4, pp. 1153-1160, August 2005.

[6] J. S. Yim, S. K. Sul, B. H. Bae, N. R. Patel, and S. Hiti, “Modified current control schemes for high-performance permanent-magnet ac drives with low sampling to operating frequency ratio,” IEEE Transactions on Industry Applications, vol. 45, no. 2, pp. 763-771, March/April 2009.

[7] S. Morimoto, H. Nakayama, and M. Sanada, “Sensorless output maximization control for variable-speed wind generation system using IPMSG,” IEEE Transactions on Industry Applications, vol. 41, no. 1, pp. 60-67, Jan/Feb 2005.

[8] T. Nakamura, S. Morimoto, M. Sanada, and Y. Takada, “Optimum control of IPMSG for wind generation system,” IEEE Power Conversion Conference, Osaka, pp. 1435-1440, 2002.

[9] C. Chris, G. R. Slemon, and R. Bonert, “Minimization of iron loss of permanent magnet synchronous machines,” IEEE Transactions on Energy Conversion, vol. 20, no. 1, pp. 121-127, March 2005.

[10] S. Morimoto, Y. Takeda, and T. Hirasa, “Current phase control methods for permanent magnet synchronous motors,” IEEE Transactions on Power Electronics, vol. 5, no. 2, pp. 133, April 1990.

[11] S. Morimoto, Y. Takeda, T. Hirasa, and K. Taniguchi, “Expansion of operating limits for permanent magnet motor by current vector control considering inverter capacity,” IEEE Transactions on Industry Applications, vol. 26, no. 5, pp. 866-871, Sep/Oct 1990.

[12] K. Yamazaki, “Torque and efficiency calculation of an interior permanent magnet motor considering harmonic iron losses of both the stator and rotor,” IEEE Transactions on Magnetics, vol. 39, no. 3, pp. 1460-1463, May 2003.

[13] R. S. Colby and D. W. Novotny, “An efficiency-optimizing permanent magnet synchronous motor drive,” IEEE Transactions on Industry Applications, vol. 24, no. 3, pp. 462-469, May/June 1988.

[14] A. Kusko and D. Galler, “Control means for minimization of losses in AC and DC motor drives,” IEEE Transactions on Industry Applications, vol. 19, no. 4, pp. 561-570, July/August 1983.


Core Losses Estimation Techniques in Electrical 

Machines with Different Supplies – A Review 

Jawad Faiz, A.M. Takbash and B. M. Ebrahimi 

Center of Excellence on Applied Electromagnetic Systems, School of Electrical and Computer Engineering, College of 

Engineering, University of Tehran, Tehran, Iran, Email: jfaiz@ut.ac.ir 

Abstract: In this paper, different methods for core losses estimation in ferromagnetic materials with non-sinusoidal supply are studied. To this end, the origin of the core losses in these materials is addressed. Since the magnetizing excitation is the most influential factor in the core losses, core losses estimation with six general types of excitation is considered and the features, advantages and disadvantages of the methods are investigated.

Index Terms: Core Loss, Finite Element Method, Hysteresis Loop, Steinmetz Equation.


I. INTRODUCTION

The major role of ferromagnetic materials in electrical machines has led to wide research towards a better understanding of these materials and their characteristics. One of the most important features of these materials is their losses. Core losses are generated by residual magnetic flux and eddy currents. From a physical point of view, these two factors have an identical origin: the movement of magnetic domain walls as well as the internal movement of the magnetic domains. The residual magnetic flux is a well-known phenomenon. When an external magnetic field is applied to a ferromagnetic material, the magnetic dipoles try to align with this external field. Even after the external magnetic field is removed, some magnetic domains preserve their alignment, and in such a case the material is magnetized. When the magnetic field changes, a small amount of energy is stored in the material because of the existing residual flux. The level of this stored energy depends on the material type. If a conductor is exposed to a varying magnetic field or moved appropriately in the magnetic field, an eddy current is induced in the conductor. Any variation of the magnetic field that causes movement of the magnetic domain walls induces eddy currents. Eddy currents generate heat and electromagnetic forces. For dc excitation, there are residual and eddy current losses as well. The reason for the residual losses is the internal movement of the magnetic domains, which itself generates microscopic currents [1]. Based on these two factors, the core losses have been classified into two classes [2]. Some authors categorize the core losses into three classes [3], in which the third class is the stray losses due to external factors such as external magnetic fields. Two factors influence the core losses. The first factor is the magnetic material alloy; for instance, raising the Si content of a SiFe magnetic alloy reduces the eddy current losses. The second factor is external. The magnetic properties of magnetic materials are affected by cutting, pressing and welding processes. For instance, cutting or welding magnetic sheets can increase the residual losses of the sheets. Pressing the sheets increases the eddy currents. One method for reducing the eddy currents is to use thin sheets. However, this leads to higher residual losses [4], [5]. Another external factor affecting the core losses is the magnetizing excitation. In the case of sinusoidal excitation, the well-known classic Steinmetz equation can be used for core losses estimation:

$$P_{ir} = k_h f_s B_m^{x} + k_e f_s^2 B_m^2 \qquad (1)$$

where $P_{ir}$ is the core losses, $f_s$ is the supply frequency, $B_m$ is the magnetic flux density magnitude and $k_h$, $k_e$, $x$ are the Steinmetz factors. Such equations lead to errors in the case of non-sinusoidal excitation, so new equations must be introduced. The use of different drives with various switching patterns, as well as internal faults in electrical machines, leads to non-sinusoidal excitation.
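As an illustration of (1), the following minimal sketch evaluates the two-term classic Steinmetz estimate for sinusoidal excitation. The function name and the coefficient values are hypothetical and were chosen only to exercise the formula; real Steinmetz factors must be fitted to loss data of the specific material.

```python
def steinmetz_core_loss(f_s, b_m, k_h, k_e, x):
    """Classic two-term Steinmetz estimate for sinusoidal excitation, Eq. (1)."""
    hysteresis = k_h * f_s * b_m ** x      # hysteresis (residual) term
    eddy = k_e * f_s ** 2 * b_m ** 2       # eddy current term
    return hysteresis + eddy


# Hypothetical coefficients, not taken from any material data sheet.
K_H, K_E, X = 0.02, 1.0e-4, 2.0

p_50 = steinmetz_core_loss(50.0, 1.0, K_H, K_E, X)    # supply at 50 Hz, 1 T peak
p_100 = steinmetz_core_loss(100.0, 1.0, K_H, K_E, X)  # supply at 100 Hz, 1 T peak
```

Doubling the frequency doubles the hysteresis term but quadruples the eddy current term. The estimate uses only the supply frequency and the peak flux density, so it carries no information about waveform shape, which is why it fails for non-sinusoidal excitation.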

Inverter-fed electrical machines and faults such as broken rotor bars and eccentricity have been important research topics in recent years. Therefore, core losses estimation in the magnetic core under such uncommon conditions is important. The approaches to core losses estimation can be classified as shown in Figure 1. This classification comprises six general methods, which are discussed in this paper.

II. FINITE-ELEMENT-BASED METHODS 

Finite element methods (FEM) are time-consuming but high-precision techniques that take into account the geometry and physics of the machine. Modeling an electrical machine involves three steps: geometrical modeling of the motor considering its physical characteristics, modeling of the motor supply considering its electrical characteristics, and modeling of the motor load taking into account its mechanical features. FEM provides the magnetic field distribution within the induction motor based on its geometrical and magnetic parameters.

Other quantities such as the air gap flux density can also be estimated from the magnetic field distribution. In FEM, the coupling between electrical and magnetic fields and the motor rotation can be taken into account. Losses in different parts of the machine can be estimated from the magnetic field distribution in the various sections of the motor.

Figure 1: Different methods for core losses estimation in non-sinusoidal excitation. The diagram groups the iron loss calculation methods into Steinmetz equations, hysteresis models, FEM, equivalent circuits, physical equations and control strategies.

A new method has been introduced in [6] for core losses estimation in a no-load motor, both direct-fed and PWM-fed, using 2D time-stepping FEM, in which the simulation results have been compared to the experimental results. To include stray losses and

rotational residual losses, a modification factor has been 

considered in the losses calculation process. This 

modification factor depends on the peak magnetic flux 

density and its distortion. In [7], a new model for 

laminated core of induction motor has been presented 

based on 3D FEM. This modeling method is based on the 

reduced magnetic potential equations and used to estimate 

the core losses with non-sinusoidal excitation caused by 

PWM application. The results are more precise than those of the 2D FEM. In [8], the authors emphasize the need for precise magnetic field data in core losses estimation; therefore a 2D FE model is applied, the impact of the non-sinusoidal supply and its harmonics on the rotor magnetic field is investigated, and the well-known integral equations are then used to estimate the core losses.

Core losses estimation in rotating electrical machines is more complicated than in static machines because of the more complicated structure and the rotating magnetic fields [9]. Therefore, FEM modeling has been modified in order to include the stray core losses due to the rotating magnetic field vector and its harmonics. To this end, two types of PM motors with different structures have been modeled using a new method and the simulation results have been compared to the experimental results. In [10],

impact of the stator slot shapes upon the core losses has 

been discussed and three different structures of induction 

motor for minimizing the core losses have been 

investigated. In this case, core losses are estimated by integrating the product of the magnetic flux density and the field intensity. The core losses distribution over the core cross-section and the impact of different stator slot shapes have been considered using FEM. In [11], a model based on eddy current analysis in the magnetic sheets is presented, which is capable of estimating the stray losses for computation of high-frequency core losses and residual losses in core sheets of electrical machines. The advantages of this method are that it takes into account the magnetic field distribution along the sheet thickness, using one-dimensional non-linear FEM over non-linear 2D elements, and the impact of frequency and magnetic flux density on the quantities related to the core losses. In [12], a model has been introduced to study the impact of PWM supply on the induction motor core losses. The three components of the core losses have been represented by a combined model using FEM. The results have been compared to the traditional modeling and experimental results. This combined model consists of a static model and a dynamic model, in which the impact of the residual minor loops has also been included. In addition, the losses distribution over the motor and the separation of the different components of the core losses have been pointed out. In [13], induction motor performance

has been analyzed using 2D FEM. This is one of the few works in which the impact of internal faults of the induction motor, such as broken rotor bars and eccentricity, and of the PWM drive upon core and Ohmic losses has been considered. It is shown that a broken rotor bar increases the losses around the damaged bar and that the PWM supply also increases the core losses density. In [14], a permanent magnet (PM) motor under static, dynamic and mixed eccentricity faults has been analyzed using FEM, and the impact of these faults on core and Ohmic losses has been investigated. Figure 2 shows the impact of different eccentricities on the eddy current and residual losses.

III. HYSTERESIS LOOP MODEL BASED METHODS 

A precise mathematical model of the hysteresis loop of a magnetic material is useful for accurate estimation of the core losses. First, classical hysteresis models such as [15] were used to calculate the losses. Recently, improved hysteresis models such as the loss surface model (LSM) and the energy-based vector hysteresis model have been employed to estimate the losses; these are briefly described below. The LSM is a numerical, dynamic model for core losses evaluation which has been applied to thick magnetic laminations in [16]. This method is based on the definition of the magnetic field as a surface function of the magnetic flux density and its rate of change as follows:

$$S = H\left(B, \frac{dB}{dt}\right) = H_{stat}(B) + H_{dyn}\left(B, \frac{dB}{dt}\right) \tag{2}$$

In fact, this method is a combination of a static and a dynamic model and is capable of modeling the static and dynamic behaviors of the hysteresis loop. The static behavior is modeled using different hysteresis curves and the dynamic behavior using six parameters which depend on the magnetic flux density and the time variation of its derivative. A vector magnetic hysteresis model has been used in [17] to estimate the core losses. In this method, the magnetic field intensity has been evaluated using the vertical components of the magnetic flux density, and then core losses have been calculated by integration of the product of the magnetic flux density and field intensity.
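The integration step mentioned above can be sketched for a scalar case: the energy lost per cycle and unit volume is the area enclosed by the B-H loop, i.e. the closed integral of H dB. The sketch below is not the vector model of [17]; it assumes a sampled closed loop and uses a hypothetical elliptical loop (H leading B by a phase angle) only because its area is known in closed form.

```python
import math


def loop_energy_density(h, b):
    """Energy per cycle and unit volume (J/m^3): area enclosed by a sampled closed B-H loop."""
    n = len(h)
    area = 0.0
    for i in range(n):
        j = (i + 1) % n                              # wrap around to close the loop
        area += 0.5 * (h[i] + h[j]) * (b[j] - b[i])  # trapezoidal rule for H dB
    return abs(area)                                 # sign only reflects the traversal direction


def elliptical_loop(h_m, b_m, phi, n=2000):
    """Hypothetical test loop: B = b_m sin(theta), H = h_m sin(theta + phi)."""
    theta = [2.0 * math.pi * k / n for k in range(n)]
    h = [h_m * math.sin(t + phi) for t in theta]
    b = [b_m * math.sin(t) for t in theta]
    return h, b


h, b = elliptical_loop(100.0, 1.2, math.radians(20.0))   # A/m, T, loss angle
w_cycle = loop_energy_density(h, b)                      # J/m^3 per cycle
p_50hz = 50.0 * w_cycle                                  # W/m^3 at 50 cycles per second
```

For the elliptical loop the enclosed area is pi * h_m * b_m * sin(phi), which gives a simple check of the numerical integration. With measured or simulated loops the same function applies unchanged, provided the samples cover exactly one closed cycle.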


Figure 2: (a) Hysteresis losses and (b) eddy current losses for different degrees and types of eccentricity [14] 

The Preisach and Jiles-Atherton models have been compared for three different magnetic materials in order to obtain an optimal method for precise modeling of the magnetic cores using FEM with reasonable computation time [18]. Mathematical models require the full hysteresis loop and a series of magnetic parameters of the material, which complicates their application [19].

IV. STEINMETZ EQUATIONS-BASED METHODS 

The purpose of the improved Steinmetz equations is to estimate core losses for non-sinusoidal magnetic flux density analytically. This modification is done using different methods. Improved Steinmetz equations have been employed to estimate the core losses in the switched reluctance motor (SRM) [20], [21]. First, the SRM behavior is analyzed using FEM, and then improved core losses equations for the SRM are considered, which include the impact of the minor hysteresis loops in addition to the effect of current harmonics.

$$P_c = k_{cf}\, C_h f B_{max}^{\,a + b B_{max}} + \frac{1}{2\pi^2}\, C_e \left(\frac{dB}{dt}\right)^2_{avg} \tag{3}$$

where $k_{cf}$ is the modification factor that takes into account the impact of the minor hysteresis loops within the major loops. Another idea for improving the classical Steinmetz equations is to obtain an equivalent frequency for the non-sinusoidal signal, which has been used for magnetic sheets [22] and transformers [23]:

$$f_{eq} = \frac{2}{\pi^2 \left(B_{max} - B_{min}\right)^2} \int_0^T \left(\frac{dB}{dt}\right)^2 dt \tag{4}$$

where $B_{max}$ and $B_{min}$ are the maximum and minimum of the magnetic flux density. In other words, the re-magnetizing frequency is replaced by an equivalent frequency derived from the magnetic flux density variations. In addition, the impact of a dc component upon the core losses has been considered in [22] and the modified Steinmetz equations (MSE) have been introduced. Another method of modifying the Steinmetz equations is to introduce coefficients that take into account the non-sinusoidal magnetic excitation waveform. In this case, the distorted magnetic flux density waveform is used to determine these coefficients. For instance, in [24] the Steinmetz equations have been changed such that the hysteresis losses are expressed versus the mean value of the rectified waveform of the magnetic flux density and the eddy current losses versus the rms value of this waveform. Consequently, these coefficients are estimated when the magnetic flux density waveforms for both the direct-fed and the PWM-fed cases are known, and the core losses in abnormal operation are expressed in terms of the core losses in normal operation. In [25], the traditional Steinmetz equations and their modifications have been used to estimate the magnetic sheet losses under PWM supply; the impact of frequency and magnetic flux density variations has been included in the coefficients for different materials, as has the impact on the core losses of the magnetic flux density waveform variations due to the drive. To account for the frequency and the magnetic flux density in the Steinmetz coefficients, these coefficients are expressed as third-order polynomials in the magnetic flux density.
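The equivalent frequency idea can be sketched numerically. The code below assumes that (4) reads f_eq = 2 / (pi^2 (B_max - B_min)^2) times the integral of (dB/dt)^2 over one period, and that the waveform is given as uniformly spaced samples covering exactly one period. The function and waveform helpers are hypothetical names introduced for illustration.

```python
import math


def equivalent_frequency(b, period):
    """Equivalent frequency of a periodic flux density waveform, following the reading of Eq. (4).

    b      : uniformly spaced samples over exactly one period (last sample not repeated)
    period : duration T of one period in seconds
    """
    n = len(b)
    dt = period / n
    integral = 0.0
    for i in range(n):
        db = b[(i + 1) % n] - b[i]          # periodic forward difference
        integral += (db / dt) ** 2 * dt     # rectangle rule for (dB/dt)^2 dt
    delta_b = max(b) - min(b)               # peak-to-peak flux density
    return 2.0 * integral / (math.pi ** 2 * delta_b ** 2)


def sine_wave(b_m, n):
    return [b_m * math.sin(2.0 * math.pi * k / n) for k in range(n)]


def triangle_wave(b_m, n):
    # Rises linearly from -b_m to +b_m in the first half period, then falls back.
    out = []
    for k in range(n):
        x = k / n
        out.append(b_m * (4.0 * x - 1.0) if x < 0.5 else b_m * (3.0 - 4.0 * x))
    return out


F = 50.0
f_eq_sine = equivalent_frequency(sine_wave(1.0, 4000), 1.0 / F)
f_eq_tri = equivalent_frequency(triangle_wave(1.0, 4000), 1.0 / F)
```

For a sinusoid the equivalent frequency reduces to the actual frequency, which is the normalisation built into the factor 2/pi^2. For a symmetric triangular waveform of the same period it evaluates to 8f/pi^2, so the result depends on the waveform shape and not only on the repetition rate.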

V. PHYSICS-DEFINED LOSSES BASED METHODS 

Early approaches estimated the core losses from the physical definitions of the various loss components. For instance, in [26] a modeling method was introduced for the magnetic domains in the material, and the eddy currents, their cause and the associated losses were discussed. An advantage of this model is that it can be used over a wide range of flux density, up to the saturation level, and over a wide frequency band. In [27], the formation of eddy currents in the magnetic sheets and the effect of external magnetic fields have been considered. The impact of the internal magnetic fields (adjacent magnetic domain walls) on the eddy currents has also been included. In [28], core losses, as a non-linear function of frequency, are expressed as three types: hysteresis, classic and stray losses. The classic and stray losses are as follows:

$$P^{(class)} = \frac{\pi^2 \sigma d^2 I_{max}^2 f_m^2}{6} \tag{5}$$

$$P^{(exc)} = \left(1.63\,\frac{2L}{d}\right) P^{(class)} \tag{6}$$

where $\sigma$ is the conductivity, $d$ is the magnetic sheet thickness and $L$ is the magnetic domain dimension. Core losses have been categorized into two types: (1) core losses constant against frequency (hysteresis losses), and (2) core losses depending on frequency (eddy current losses and abnormal stray losses) [29]. Each category has been expressed by equations versus frequency and magnetic flux density. This method is not precise, mainly because it classifies the losses only by whether or not they depend on frequency. In addition, owing to its simplifications, this method gives appropriate results only over a particular range of magnetic flux density amplitudes.

Figure 4: Various losses in the mains-fed and inverter-fed induction motor [31]

In [30], [31], core losses have been divided into hysteresis losses, eddy current losses, and stray eddy current losses. Classic eddy current losses and stray eddy current losses are as follows:

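$$W_c = \frac{\sigma d^2}{12 f m_v}\,\frac{1}{T}\int_0^T \left(\frac{dB}{dt}\right)^2 dt \tag{7}$$

$$W_e = \frac{1}{f m_v}\sqrt{\sigma G V_0 S}\;\frac{1}{T}\int_0^T \left|\frac{dB}{dt}\right|^{1.5} dt \tag{8}$$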

where $m_v$ is the magnetic material density. In the stray eddy current losses, $S$ is the magnetic sheet cross-section, $G$ is a dimension factor and $V_0$ is a parameter that determines the local field distribution. An equivalent frequency is then defined to estimate the losses based on the physical definitions, taking into account the harmonics within the core magnetic flux density. Figure 4 shows various losses in the mains-fed and inverter-fed induction motor. Also in [32], the integral definition of the three-way subdivision of core losses has been used to estimate the core losses. In this case, losses equations have been presented versus mmf for different emf waveforms (sinusoidal, triangular and rectangular), considering the relationship between emf and magnetic flux density and a piecewise-linear model of the emf waveform. The relevant equations for the different losses have been given for each waveform. In addition, the impact of different parameters such as the duty cycle upon the various core losses has been investigated. In [33], the influence of minor hysteresis loops within the magnetic sheet hysteresis loop upon core losses estimation has been investigated, and their complicated time variations versus peak magnetic flux density and magnetic polarization vector for estimation of the three core loss components have been presented.
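The classic eddy current term can be evaluated directly from a sampled flux density waveform. The sketch below assumes that (7) reads W_c = sigma d^2 / (12 f m_v) times the period average of (dB/dt)^2, giving energy per cycle and unit mass. The function name and the material values are hypothetical and serve only as an illustration.

```python
import math


def classic_eddy_energy(b, period, sigma, d, m_v):
    """Classic eddy current energy loss per cycle and unit mass, following the reading of Eq. (7).

    b      : uniformly spaced flux density samples over exactly one period (T)
    period : period T in seconds
    sigma  : electrical conductivity in S/m
    d      : sheet thickness in m
    m_v    : mass density in kg/m^3
    """
    n = len(b)
    dt = period / n
    f = 1.0 / period
    # Period average of (dB/dt)^2 using periodic forward differences.
    mean_sq = sum(((b[(i + 1) % n] - b[i]) / dt) ** 2 for i in range(n)) / n
    return sigma * d ** 2 / (12.0 * f * m_v) * mean_sq


# Hypothetical electrical steel data, for illustration only.
SIGMA, D, M_V = 2.0e6, 0.5e-3, 7650.0
F, B_M, N = 50.0, 1.5, 4000

b_sine = [B_M * math.sin(2.0 * math.pi * k / N) for k in range(N)]
w_numeric = classic_eddy_energy(b_sine, 1.0 / F, SIGMA, D, M_V)

# Closed form for a sinusoid: the average of (dB/dt)^2 equals 2 pi^2 f^2 B_m^2.
w_closed = SIGMA * math.pi ** 2 * D ** 2 * F * B_M ** 2 / (6.0 * M_V)
```

For sinusoidal flux density the numerical value agrees with the closed form pi^2 sigma d^2 f B_m^2 / (6 m_v). For a distorted waveform only the sample list changes, which is the practical appeal of the integral definition.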

Figure 5 shows how these minor hysteresis loops are generated. Variations in the magnetic polarization envelope cause minor hysteresis loops within the major hysteresis loop. Dividing the core losses into three components, with integral definitions in terms of the magnetic flux density and its derivative, leads to complicated computations. An important point in these methods is their dependency on coefficients which depend on the physical and chemical characteristics of the material and its molecular structure; however, these coefficients are not readily available and determining them requires tests, so applying these methods is difficult.

Figure 5: Generating minor hysteresis loops within major hysteresis loop [33].

Figure 6: dq equivalent circuit considering core losses [36].

In [34], eddy current losses have been evaluated by

application of different double-magnetic-excitation on the magnetic sheets using Maxwell equations. These computations have been carried out on a steel sheet and can be extended to the whole electrical machine. However, the computation results differ considerably from the test results and no justification has been given. Also, the application of the complicated Maxwell equations is an important problem, particularly in selecting the numerical method for solving the equations. In [35], the superposition method has been applied to estimate the eddy current losses in the PWM-fed motor and transformer. Since the different losses do not vary linearly with the magnetic flux density, application of superposition is not correct.

VI. EQUIVALENT CIRCUIT-BASED METHODS 

The equivalent circuit of the induction motor has frequently been used to analyze the motor behavior and find its core losses. To this end, a simplified dq model of ac motors, as shown in Figure 6, can be used, although it does not lead to accurate results [36]. In addition, harmonics are ignored in the dq model, which decreases the precision of the method. In [37], an equivalent circuit with a harmonic supply has been introduced to estimate core losses. In this case, the dq model has been used and the parameters of the equivalent circuit are calculated using FEM. In [38], the impact of an unbalanced supply on the induction motor efficiency has been presented. The proposed equivalent circuit of the induction motor is not accurate, so it has been modified to represent the layered magnetic core. To do so, a resistance with leakage inductances has

been connected in parallel. The major point in the application of equivalent circuits for core losses estimation is their strong dependency on parameters which may vary with the operating conditions.

VII. CONTROL STRATEGY-BASED METHODS

Since the efficiency of an electrical machine is an important factor besides its lifetime, one of the major quantities to be reduced is the core losses of the drive-fed motor. In [39], [40], control strategies for the squirrel-cage induction motor and the PM synchronous motor have been introduced to decrease the losses. In these strategies, all equations for minimizing the losses depend on the motor parameters, and the determination of these parameters is itself critical. In [41], the impact of the core losses of the induction motor on the stator-flux oriented control has been studied and a control strategy taking into account the induction motor losses has been introduced. In this method, core losses have been modeled by a resistance in parallel with the magnetizing inductance. In [42], a vector control-based strategy for PMSM has been presented to maximize the motor efficiency. To this end, a loss model has been suggested. In this method, the d-component of the stator current is determined so as to maximize the IPM motor efficiency. In this strategy, losses are divided into Ohmic, core, mechanical and harmonic losses, which are then calculated.

This test-based method can be extended to different types of PM and reluctance motors. However, such strategies depend strongly on the motor parameters, which are not constant and vary with the operating point of the motor. Table I summarizes the core losses estimation methods for different supplies. In this table, factors such as the complexity of the methods, the need for magnetic material parameters, the response to non-sinusoidal excitation waveforms and the precision of the methods have been compared. As seen in Table I, some accurate methods are complicated and need a large amount of magnetic material data. Some methods are simple but have low accuracy. Therefore, a proper method must be selected based on the application.

TABLE I
COMPARISON OF DIFFERENT CORE LOSSES ESTIMATION METHODS

Method               Complex waveform   Complexity   Material knowledge   Accuracy
Steinmetz Eqn.       -                  Low          Low                  Low
MSE                  +                  Low          Low                  Medium
Hysteresis model     +                  High         High                 Good
Physical Eqn.        +                  High         High                 Good
FEM                  +                  High         Medium               Good
Equivalent Circuit   -                  Medium       Low                  Low
Control Strategy     +                  Medium       Low                  Medium

VIII. CONCLUSION

Different methods have been proposed for core losses estimation in magnetic materials under different excitations. These methods were classified and studied. Factors such as the complexity of the methods, the need for magnetic material parameters, the response to non-sinusoidal excitation waveforms and the precision of each method were investigated and summarized. Methods such as FEM, hysteresis models and physical equations lead to accurate results, but they are complicated and need a large amount of magnetic material data. The MSE method is simple, does not need a large amount of magnetic material data and has average accuracy. Control strategies are not used for direct core losses estimation. The parameters of the equivalent electrical circuits of the motor also depend on the operating point of the motor, which is a difficulty of these methods.


The authors would like to thank Iran's National Elites Foundation (INEF) for financial support of the project.

REFERENCES

[1] C. D. Graham, "Physical origin of losses in conducting ferromagnetic materials," Journal of Applied Physics, vol. 53, no. 11, pp. 8276-8280, Nov. 1982.

[2] M. S. Lancarotte, Jr. Aderbalde, and A. Penteado, "Estimation of core losses under sinusoidal or non-sinusoidal induction by analysis of magnetization rate," IEEE Transactions on Energy Conversion, vol. 16, no. 2, pp. 174-179, June 2001.

[3] M. Amar, and F. Protat, "A simple method for the estimation of 

power losses in silicon iron sheets under alternating pulse voltage 

excitation," IEEE Transactions on Magnetics, vol. 30, no. 2, pp. 

942-944, March 1994. 

[4] W. Arshad, T. Ryckebusch, F. Magnussen, H. Lendenmann, B. 

Eriksson, J. Soulard, and B. Malmros, "Incorporating lamination 

processing and component manufacturing in electrical machine 

design tools," 42nd IAS Annual Meeting of Industry Applications 

Society, pp. 94-102, 2007. 

[5] H. Skarrie, "Design of powder core inductors," Licentiate Thesis, 

Lund University, Sweden, 2001. 

[6] Z. Gmyrek, A. Boglietti, and A. Cavagnino, "Estimation of iron 

losses in induction motors: calculation method, results, and 

analysis," IEEE Transactions on Industrial Electronics, vol. 57, 

no. 1, pp. 161-171, January 2010. 

[7] K. B. Tatis, A. G. Kladas, and J. A. Tegopoulos, "Harmonic iron loss determination in laminated iron cores by using a particular 3-D finite-element model," IEEE Transactions on Magnetics, vol. 40, no. 2, pp. 860-863, March 2004.

[8] A. M. Knight, and Y. Zhan, "Identification of flux density 

harmonics and resulting iron losses in induction machines with 

nonsinusoidal supplies," IEEE Transactions on Magnetics, vol. 

44, no. 6, pp.1562-1565, June 2008. 

[9] L. Ma, M. Sanada, S. Morimoto, and Y. Takeda, "Iron loss 

prediction considering the rotational field and flux density 

harmonics in IPMSM and SynRM," IEE Proceedings - Electric 

Power Applications, vol. 150, no. 6, pp. 747- 751, November 

2003. 

[10] M. Enokizono, H. Shimoji, and T. Horibe, "Effect of stator 

construction of three-phase induction motors on core loss," IEEE 

Transactions on Magnetics, vol. 39, no. 3, pp. 1484- 1487, May 

2003. 

[11] K. Yamazaki, and N. Fukushima, "Experimental validation of 

iron loss model for rotating machines based on direct eddy current

analysis of electrical steel sheets," IEEE International Conference 

on Electric Machines and Drives. IEMDC '09, pp. 851-857, 3-6 

May 2009. 

[12] E. Dlala, and A. Arkkio, "A general model for investigating the effects of the frequency converter on the magnetic iron losses of a squirrel-cage induction motor," IEEE Transactions on Magnetics, vol. 45, no. 9, pp. 3303-3315, Sept. 2009.

[13] J. F. Bangura, and N. A. Demerdash, "Effects of broken bars/end-ring connectors and air gap eccentricities on Ohmic and core losses of induction motors in ASDs using a coupled finite element-state space method," IEEE Transactions on Energy Conversion, vol. 15, no. 1, pp. 40-47, Mar. 2000.

[14] B. M. Ebrahimi, and J. Faiz, "Diagnosis and performance 

analysis of three phase permanent magnet synchronous motors 

with static, dynamic and mixed eccentricity," Electric Power 

Applications, IET, vol. 4, no. 1, pp. 53-66, January 2010. 

[15] D. Jiles, and D. Atherton, "Theory of ferro-magnetic hysteresis," 

Journal of Magnetism and Magnetic Materials, vol. 61, no. 1-2, 

pp. 48-60, Sep.1986. 

[16] T. Chevalier, A. Kedous Lebouc, B. Cornut, and C. Cester, "A 

new dynamic hysteresis model for electrical steel sheet," Physica 

B: Condensed Matter, vol. 275, no. 1-3, pp. 197-201, 2000. 

[17] M. Enokizono, and T. Horibe, "Loss evaluation of induction motor by using E&S2 model magnetic hysteresis," IEEE Transactions on Magnetics, vol. 38, no. 5, pp. 2379-2381, September 2002.

[18] A. Benabou, S. Clenet, and F. Piriou, "Comparison of Preisach 

and Jiles–Atherton models to take into account hysteresis 

phenomenon for finite element analysis," Journal of Magnetism 

and Magnetic Materials, no. 261, pp. 139–160, 2003. 

[19] A. Krings, and J. Soulard, "Overview and comparison of iron loss models for electrical machines," Ecological Vehicle and Renewable Energy (EVER), Monaco, 25-28 March 2010.

[20] B. Ganji, J. Faiz, K. Kasper, C. E. Carstensen, and R. W. 

DeDoncker, "Core loss model based on finite-element method for 

switched reluctance motors," IEE Proceedings - Electric Power 

Applications, vol. 4, no. 7, pp. 569–577, 2010. 

[21] J. Faiz, B. Ganji, B. R. W. De Doncker, and J. O. Fiedler, 

"Electromagnetic modeling of switched reluctance motor using 

finite element method," 32nd Annual Conference on IEEE 

Industrial Electronics, IECON, pp. 1557-1562, 6-10 Nov. 2006. 

[22] J. Reinert, A. Brockmeyer, and R. W. De Doncker, "Calculation of losses in ferro- and ferrimagnetic materials based on the modified Steinmetz equation," IEEE Transactions on Industry Applications, vol. 37, no. 4, pp. 1055-1061, Jul./Aug. 2001.

[23] M. Albach, T. Durbaum, and A. Brockmeyer, "Calculating core losses in transformers for arbitrary magnetizing currents: a comparison of different approaches," 27th Annual IEEE Power Electronics Specialists Conference, PESC '96, vol. 2, pp. 1463-1468, 23-27 Jun 1996.

[24] A. Boglietti, A. Cavagnino, M. Lazzari, and M. Pastorelli, 

"Predicting iron losses in soft magnetic materials with arbitrary 

voltage supply: an engineering approach," IEEE Transactions on 

Magnetics, vol. 39, no. 2, pp. 981- 989, Mar. 2003. 

[25] D. Ionel, M. Popescu, C. Cossar, M. I. McGilp, A. Boglietti, 

and A. Cavagnino, "A general model of the laminated steel losses 

in electric motors with PWM voltage supply," IEEE Industry 

Applications Society Annual Meeting. IAS '08., pp. 1-7, 5-9 Oct. 

2008. 

[26] R. H. Pry, and C. P. Bean, "Calculation of the energy loss in 

magnetic sheet materials using a domain model," Journal of 

Applied Physics, vol. 29, no. 3, pp. 532-533, Mar. 1958. 


[27] G. Bertotti, "Physical interpretation of eddy current losses in 

ferromagnetic materials. I. Theoretical considerations," Journal of 

Applied Physics, vol. 57, no. 6, pp. 2110-2117, March 1985. 

[28] G. Bertotti, "General properties of power losses in soft 

ferromagnetic materials," IEEE Transactions on Magnetics, vol. 

24, no. 1, pp. 621-630, Jan. 1988. 

[29] R. Newbury, "Prediction of loss in silicon steel from distorted 

waveforms," IEEE Transactions on Magnetics, vol. 14, no. 4, pp. 

263- 268, Jul. 1978. 

[30] M. Amar and R. Kaczmarek, "A general formula for prediction 

of iron losses under non-sinusoidal voltage waveform," IEEE 

Transactions on Magnetics, vol. 31, no. 5, pp. 2504-2509, Sep. 

1995. 

[31] T. C. Green, C. A. Hernandez-Aramburo, and A. C. Smith, "Losses in grid and inverter supplied induction machine drives," IEE Proceedings - Electric Power Applications, vol. 150, no. 6, pp. 712-724, 7 Nov. 2003.

[32] W. A. Roshen, "A practical, accurate and very general core loss model for nonsinusoidal waveforms," IEEE Transactions on Power Electronics, vol. 22, no. 1, pp. 30-40, Jan. 2007.

[33] E. Barbisio, F. Fiorillo, and C. Ragusa, "Predicting loss in 

magnetic steels under arbitrary induction waveform with minor 

hysteresis loops," IEEE Transactions on Magnetics, vol. 40, no. 

4, pp. 1810- 1819, July 2004. 

[34] J. Sagarduy, A. J. Moses, and F. J. Anayi, "Eddy current losses in 

electrical steels subjected to matrix and classical PWM excitation 

waveforms," IEEE Transactions on Magnetics, vol. 42, no. 10, 

pp. 2818-2820, Oct. 2006. 

[35] Ruifang Liu, C. C. Mi, and D. W. Gao, "Modeling of eddy-current loss of electrical machines and transformers operated by pulse width-modulated inverters," IEEE Transactions on Magnetics, vol. 44, no. 8, pp. 2021-2028, Aug. 2008.

[36] M. Popescu, D. G. Dorrell, and D. M. Ionel, "A study of the 

engineering calculations for iron losses in 3-phase AC motor 

models," IEEE 33rd Annual Conference of the Industrial 

Electronics Society. IECON 2007., pp. 169-174, 5-8 Nov. 2007. 

[37] K. Yamazaki, "Torque and efficiency calculation of an interior 

permanent magnet motor considering harmonic iron losses of both 

the stator and rotor," IEEE Transactions on Magnetics, vol. 39, 

no. 3, pp. 1460- 1463, May 2003. 

[38] A. Vamvakari, A. Kandianis, A. Kladas, S. Manias, and J. 

Tegopoulos, "Analysis of supply voltage distortion effects on 

induction motor operation," IEEE Transactions on Energy 

Conversion, vol. 16, no. 3, pp. 209-213, Sep. 2001. 

[39] S. Lim, and K. Nam, "Loss-minimizing control scheme for 

induction motors," IEE Proceedings -Electric Power 

Applications, vol. 151, no. 4, pp. 385- 397, 7 July 2004. 

[40] J. Lee, K. Nam, S. Choi, and S. Kwon, "Loss-minimizing control 

of PMSM with the use of polynomial approximations," IEEE 

Transactions on Power Electronics, vol. 24, no. 4, pp. 1071- 

1082, April 2009. 

[41] S. D. Wee, M. H. Shin, and D. S. Hyun, "Stator-flux-oriented 

control of induction motor considering iron loss," IEEE 

Transactions on Industrial Electronics, vol. 48, no. 3, pp. 602- 

608, Jun. 2001. 

[42] C. Mademlis, I. Kioskeridis, and N. Margaris, "Optimal 

efficiency control strategy for interior permanent-magnet 

synchronous motor drives," IEEE Transactions on Energy 

Conversion, vol. 19, no. 4, pp. 715- 723, Dec. 2004.


Fast Computation of Inductances and Capacitances of 

High Voltage Power Transformer Windings 

*Župan, Tomislav, *Štih, Željko and *Trkulja, Bojan 

*Faculty of Electrical Engineering and Computing, University of Zagreb, Unska 3, 10000 Zagreb, Croatia 

E-mail: tomislav.zupan@fer.hr 

Abstract— Inductances and capacitances in analysis of fast transients in high voltage transformers are usually calculated on 

the basis of simple analytical approximations for applications in lumped circuit models. This paper presents the application 

of the boundary element method to calculation of capacitances and inductances of transformer windings with coils of 

rectangular cross section. A two-dimensional axially symmetric mathematical model of the electric and magnetic fields in the transformer, based on integral equations, is introduced. The accuracy and applicability of the proposed approach are illustrated by an example.

Index Terms—boundary element method, capacitance calculation, inductance calculation, rectangular cross section coils. 


I. INTRODUCTION

Power transformers are among the most important components of electric power systems. Because of their long service life, they are exposed to numerous overvoltage events during their lifetime. It is therefore of great importance to know how the power transformer windings will respond to such excitations.

Since most overvoltages are characterized by high-frequency impulse signals (due to lightning discharges, short circuits, etc.), the capacitances between windings, which are negligible at the normal operating frequency of the power grid, become significant. It is thus useful to have, alongside an accurate calculation of inductances, a method for precise calculation of the capacitances of high voltage power transformer windings.

The distribution of voltage along transformer windings due to fast excitations (switching, lightning, and testing) is one of the most important inputs to insulation design. A lumped circuit model of the winding is most frequently applied to calculate this distribution [1]-[3]. Parts of the windings are represented by equivalent capacitances, inductances and resistances. Depending on how the conductors are connected, the equivalent diagram is made for each turn or, for the sake of faster calculation, turns are grouped.

Numerous papers deal with this problem using a numerical approach based on the finite element method (FEM) [1], [4]. In this paper we introduce the application of the boundary element method (BEM) to the calculation of capacitances and inductances of transformer windings with coils of rectangular cross section. The main advantage of this approach is ease of use, because the discretization is done only at material interfaces and boundaries, thus effectively reducing the order of the mathematical model by one dimension.

II. COMPUTATION OF CAPACITANCES 

A typical arrangement inside a high voltage power transformer is shown in Fig. 1 (simplified). The windings usually consist of low voltage (LV), high voltage (HV) and regulation voltage (RV) parts. The conductors typically have a rectangular cross section, are insulated with paper and immersed in oil, and are grouped in discs with radial and axial canals for heat transfer.

[Figure: r-z cross section showing the core and the LV, HV and RV windings]

Fig. 1. Windings in high voltage power transformers

The usual approach to the calculation of capacitances is based on simple analytical parallel-plate approximations [2]:

$$C_{Parallel} = \varepsilon_0 \frac{\pi \left(R_{out}^2 - R_{in}^2\right)}{\dfrac{d_{rc}}{\varepsilon_o} + \dfrac{\delta}{\varepsilon_p}}, \tag{1}$$

where $\varepsilon_o$ and $\varepsilon_p$ are the relative permittivities of oil and paper insulation, respectively, $d_{rc}$ is the height of the radial canal between conductors in the z axis direction, $\delta$ is the width of insulation between conductors, and $R_{in}$ and $R_{out}$ are the conductor's inner and outer radius, respectively, or on analytical cylindrical approximations [5]:

$$C_{Serial} = \frac{2\pi\varepsilon_0\varepsilon_p h_c}{\ln\dfrac{R_{av} + \delta/2}{R_{av} - \delta/2}}, \tag{2}$$

where $h_c$ is the conductor's height and $R_{av}$ is the average radius between two conductors.

These analytical methods result in significant errors and 

consequently in unsatisfactory accuracy of computation 

of voltage distribution along the winding. 
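As a minimal sketch of how the two analytical estimates are evaluated, the following Python functions implement (1) and (2) as reconstructed above. The function and parameter names, and the symbol `delta` for the insulation width, are illustrative assumptions and do not come from the paper.

```python
import math

EPS0 = 8.8541878128e-12  # vacuum permittivity [F/m]

def c_parallel(r_in, r_out, d_rc, delta, eps_oil, eps_paper):
    """Parallel-plate estimate (1): annular plates, oil canal in series with paper.

    d_rc  : height of the radial canal (oil gap) [m]
    delta : total paper thickness between the conductors [m] (assumed symbol)
    """
    area = math.pi * (r_out ** 2 - r_in ** 2)
    return EPS0 * area / (d_rc / eps_oil + delta / eps_paper)

def c_serial(r_av, delta, h_c, eps_paper):
    """Cylindrical estimate (2): coaxial capacitor filled with paper of thickness delta."""
    ratio = (r_av + delta / 2.0) / (r_av - delta / 2.0)
    return 2.0 * math.pi * EPS0 * eps_paper * h_c / math.log(ratio)

if __name__ == "__main__":
    # Hypothetical geometry, chosen only to show the call pattern.
    print(c_parallel(0.50, 0.51, 0.004, 0.001, 2.2, 3.5))
    print(c_serial(0.505, 0.001, 0.012, 3.5))
```

For a thin insulation layer the cylindrical estimate approaches the flat-plate value, which is a quick plausibility check on any implementation.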

Electromagnetic field in power transformers is 

transverse and the capacitances may be computed on the 

basis of electrostatic analysis. The electric field in a space 

composed of conducting regions at known potentials and 

regions filled with different dielectrics can be solved by a 

pair of coupled integral equations [6]. The potential $\varphi(A)$ at any point A on the surface of a conductor is related to the charge density $\sigma(B)$ on the total surface of all boundaries (conductor–dielectric and dielectric–dielectric) by the equation:

$$\varphi(A) - \int_S \sigma(B)\, G(B, A)\, dS = 0, \tag{3}$$

where:

$$G(B, A) = \frac{1}{4\pi\varepsilon_0 d_{AB}}, \tag{4}$$

and $d_{AB}$ is the distance between points A and B. The charge density $\sigma(D)$ at any point D on the dielectric–dielectric boundary is related to the charge density $\sigma(B)$ on the total surface of all boundaries by the equation:

$$\sigma(D) + 2\,\frac{\varepsilon_o - \varepsilon_i}{\varepsilon_o + \varepsilon_i} \int_S \sigma(B)\, \vec D_N(B, D) \cdot \vec n\, dS = 0. \tag{5}$$

Here, $\vec n$ is the unit vector normal to the surface at the point D, $\varepsilon_o$ is the permittivity of the region in the direction of the normal unit vector on the surface at the point D and $\varepsilon_i$ is the permittivity of the region in the opposite direction.

The geometry of the windings can be approximated as axially symmetric, so the problem becomes two-dimensional. We may therefore integrate the surface integrals in (3) and (5) in the circumferential direction and reduce them to line integrals. The kernels of the integral equations are:

$$P(\vec r, \vec r') = \frac{1}{2\pi\varepsilon_0}\sqrt{\frac{r'}{r}}\; k\, K(k)$$

$$\vec D_N(\vec r, \vec r') = D_R(\vec r, \vec r')\,\vec a_r + D_Z(\vec r, \vec r')\,\vec a_z$$

$$D_R(\vec r, \vec r') = \frac{k}{4\pi r}\sqrt{\frac{r'}{r}}\left[K(k) - \frac{2r' - k^2 (r + r')}{2r'(1 - k^2)}\,E(k)\right] \tag{6}$$

$$D_Z(\vec r, \vec r') = \frac{(z - z')\,k^3}{8\pi r\sqrt{r r'}\,(1 - k^2)}\,E(k)$$

$$k^2 = \frac{4 r r'}{(r + r')^2 + (z - z')^2}.$$

Here, k is the modulus of the elliptic integrals of the first and the second kind K(k) and E(k). The vector $\vec r' = r'\vec a_r + z'\vec a_z$ defines the position of the source point (B), and the vector $\vec r = r\vec a_r + z\vec a_z$ defines the position of the point of interest (A, D).
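The potential kernel can be checked numerically. The sketch below, which is an illustration and not the authors' code, evaluates the ring potential kernel $P = \frac{1}{2\pi\varepsilon_0}\sqrt{r'/r}\,k\,K(k)$ with $k^2 = 4rr'/[(r+r')^2+(z-z')^2]$ and compares it with a direct circumferential integration of $1/(4\pi\varepsilon_0 d)$ for a ring with unit surface charge density. The function names are hypothetical, and K(k) is computed with the arithmetic-geometric mean so that only the standard library is needed.

```python
import math

EPS0 = 8.8541878128e-12  # vacuum permittivity [F/m]

def ellipk(k2):
    """Complete elliptic integral of the first kind K(k), with k2 = k**2 < 1 (AGM iteration)."""
    a, b = 1.0, math.sqrt(1.0 - k2)
    for _ in range(40):
        a, b = 0.5 * (a + b), math.sqrt(a * b)
    return math.pi / (2.0 * a)

def ring_potential(r, z, rp, zp):
    """Closed-form kernel: potential at (r, z) of a ring at (rp, zp) with unit charge density."""
    k2 = 4.0 * r * rp / ((r + rp) ** 2 + (z - zp) ** 2)
    k = math.sqrt(k2)
    return math.sqrt(rp / r) * k * ellipk(k2) / (2.0 * math.pi * EPS0)

def ring_potential_quadrature(r, z, rp, zp, n=2000):
    """Same quantity by midpoint integration of 1/(4 pi eps0 d) around the ring."""
    total = 0.0
    for j in range(n):
        phi = 2.0 * math.pi * (j + 0.5) / n
        d = math.sqrt(r * r + rp * rp - 2.0 * r * rp * math.cos(phi) + (z - zp) ** 2)
        total += rp / d
    return total * (2.0 * math.pi / n) / (4.0 * math.pi * EPS0)

if __name__ == "__main__":
    print(ring_potential(0.8, 0.3, 1.0, 0.0))
    print(ring_potential_quadrature(0.8, 0.3, 1.0, 0.0))
```

The closed form replaces an azimuthal quadrature per source point with a single elliptic-integral evaluation, which is the practical benefit of the axisymmetric reduction.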


 

The terms $P(\vec r, \vec r')$, $D_R(\vec r, \vec r')$ and $D_Z(\vec r, \vec r')$ represent the kernels for the electric potential and for the radial and axial components of the electric induction vector of a uniformly charged ring with negligible cross section, respectively [7]. Because the conductors inside power transformers have a rectangular cross section, their surface can be divided, in the BEM approach, into either thin cylinders or thin discs. As shown in Fig. 2, both cases can, for the sake of generality, be represented by a truncated cone.

[Figure: a conductor segment as a cylinder, a disc, or a general truncated cone]

Fig. 2. General representation of the conductor segment division

 

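The final expressions for $P(\vec r, \vec r_b, \vec r_e)$, after integrating (6) over the segment l, are:

$$P(\vec r, \vec r_b, \vec r_e) = \frac{l}{\pi\varepsilon_0}\int_0^1 \frac{r(t)\,K(k)}{\sqrt{q}}\,dt$$

$$l = \sqrt{(z_e - z_b)^2 + (r_e - r_b)^2} \tag{7}$$

$$r(t) = (r_e - r_b)\,t + r_b, \qquad z(t) = (z_e - z_b)\,t + z_b,$$

where r(t) and z(t) represent the parametric notation of the general point on segment l, l is the length of the segment and q and k are:

$$q = \left(r + r(t)\right)^2 + \left(z - z(t)\right)^2, \qquad k^2 = \frac{4\,r\,r(t)}{q}. \tag{8}$$

From the equation $\vec E = -\left(\dfrac{\partial\varphi}{\partial r},\, 0,\, \dfrac{\partial\varphi}{\partial z}\right)$ we obtain the kernels of the electric induction vector:

$$D_R(\vec r, \vec r_b, \vec r_e) = \frac{l}{\pi}\int_0^1 r(t)\,\frac{2 r(t) K(k)(1 - k^2) - E(k)\left[2 r(t) - k^2\left(r + r(t)\right)\right]}{k^2 (1 - k^2)\, q^{3/2}}\,dt$$

$$D_Z(\vec r, \vec r_b, \vec r_e) = \frac{l}{\pi}\int_0^1 \frac{r(t)\left(z - z(t)\right)E(k)}{(1 - k^2)\, q^{3/2}}\,dt. \tag{9}$$

The system of integral equations (3) and (5) is solved by BEM. Boundaries and interfaces between two dielectrics are divided into finite segments and the unknown distribution of surface charge density on the i-th segment is approximated as a linear combination of predefined basis functions:

$$\sigma_i(\vec r') = \sum_{n=1}^{N} \sigma_{in}\, t_{in}(\vec r'). \tag{10}$$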

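The simplest approach is to use basis functions which are constant (N=1) on the finite segment. The application of (10) to (3) and (5) results in:

$$\varphi_0(\vec r) = \sum_{i=1}^{N} \sigma_i \int_{C_i} P(\vec r, \vec r_p, \vec r_k)\, dC_i; \quad \vec r \in C_0$$

$$\sigma_i(\vec r) + 2\,\frac{\varepsilon_o - \varepsilon_i}{\varepsilon_o + \varepsilon_i}\sum_{i=1}^{N} \sigma_i \int_{C_i} \vec D_N(\vec r, \vec r_p, \vec r_k)\cdot\vec n\, dC_i = 0; \quad \vec r \in C_i. \tag{11}$$

Here, $C_0$ is the boundary at known potential $\varphi_0$ and $C_i$ is an interface between two dielectrics. A linear equation system for the unknown coefficients $\sigma_i$ is derived by enforcing an exact solution at the midpoint of each finite segment (point-matching [8]). The integrals in (11) are evaluated numerically. When the point $\vec r$ lies on the segment $C_i$, the integrals become singular. In such a case the vicinity of the singular point is treated separately and its contribution is calculated analytically (logarithmic singularities).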

After the surface charge distribution on the conductors has been determined, the capacitances are calculated by:

$$C_{ij} = \frac{Q_{ij}}{\varphi_i - \varphi_j}; \quad i \neq j. \tag{12}$$

Here, $\varphi_i$ and $\varphi_j$ are the potentials of the i-th and j-th conductor, respectively, and $Q_{ij}$ is the total charge on the j-th conductor induced by the charge on the i-th conductor:

$$Q_{ij} = \int_{S_j} \sigma_j\, dS_j = \sum_{k=1}^{N_j} \sigma_{kj} S_{kj}, \tag{13}$$

where $\sigma_{kj}$ is the surface charge density on the k-th segment of the j-th conductor, $N_j$ is the number of finite segments on the j-th conductor and $S_{kj}$ is the surface area of the k-th segment of the j-th conductor.

We calculate the capacitances by setting the potential of 

the i-th conductor to 1V and the potential of all other 

conductors to zero. Then, we obtain the total charge on 

conductors and use (12) to calculate the capacitances. 
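The procedure of this section can be illustrated on a case with a known answer. The sketch below is an assumption-laden toy example rather than the authors' implementation: it applies piecewise-constant point-matching to an isolated conducting sphere held at 1 V (no dielectric interfaces, so only the potential equation is needed) and compares the result with the exact capacitance $4\pi\varepsilon_0 a$. The singular self-terms are handled here by a midpoint rule with an even number of nodes, which never samples the collocation point, instead of the analytical treatment described in the text. All function names are hypothetical.

```python
import math

EPS0 = 8.8541878128e-12  # vacuum permittivity [F/m]

def ellipk_kc(kc):
    """K(k) from the complementary modulus kc = sqrt(1 - k^2), by AGM iteration."""
    a, b = 1.0, kc
    for _ in range(40):
        a, b = 0.5 * (a + b), math.sqrt(a * b)
    return math.pi / (2.0 * a)

def segment_potential(r, z, rb, zb, re, ze, m=16):
    """Potential at (r, z) of a conical segment (rb, zb)-(re, ze) with unit charge density.
    Midpoint rule with even m, so no node coincides with the segment midpoint."""
    length = math.hypot(re - rb, ze - zb)
    total = 0.0
    for j in range(m):
        t = (j + 0.5) / m
        rt, zt = rb + (re - rb) * t, zb + (ze - zb) * t
        q = (r + rt) ** 2 + (z - zt) ** 2
        kc = math.sqrt(((r - rt) ** 2 + (z - zt) ** 2) / q)  # sqrt(1 - k^2)
        total += rt * ellipk_kc(kc) / math.sqrt(q)
    return length * total / (m * math.pi * EPS0)

def solve(a, b):
    """Dense Gaussian elimination with partial pivoting (modifies a and b)."""
    n = len(b)
    for c in range(n):
        p = max(range(c, n), key=lambda i: abs(a[i][c]))
        a[c], a[p] = a[p], a[c]
        b[c], b[p] = b[p], b[c]
        for i in range(c + 1, n):
            f = a[i][c] / a[c][c]
            for j in range(c, n):
                a[i][j] -= f * a[c][j]
            b[i] -= f * b[c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(a[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (b[i] - s) / a[i][i]
    return x

def sphere_capacitance(radius=1.0, n_seg=20):
    """Point-matching BEM for an isolated conducting sphere at 1 V; returns C in farads."""
    th = [math.pi * i / n_seg for i in range(n_seg + 1)]
    pts = [(radius * math.sin(t), radius * math.cos(t)) for t in th]
    segs = [(pts[i], pts[i + 1]) for i in range(n_seg)]
    mids = [((p[0] + e[0]) / 2.0, (p[1] + e[1]) / 2.0) for p, e in segs]
    # Row i: potential at collocation point i; column j: contribution of segment j.
    a = [[segment_potential(rm, zm, p[0], p[1], e[0], e[1]) for p, e in segs]
         for rm, zm in mids]
    sigma = solve(a, [1.0] * n_seg)
    # Area of a conical segment: 2*pi*r_mid*length; total charge at 1 V equals C.
    areas = [2.0 * math.pi * mid[0] * math.hypot(e[0] - p[0], e[1] - p[1])
             for mid, (p, e) in zip(mids, segs)]
    return sum(s * ar for s, ar in zip(sigma, areas))

if __name__ == "__main__":
    c = sphere_capacitance()
    print(c, 4.0 * math.pi * EPS0)
```

With several conductors the same matrix is reused: one right-hand side per conductor set to 1 V, with the induced charges on the other conductors giving the mutual capacitances through (12) and (13).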

III. CAPACITANCE CALCULATION -EXAMPLE 

The procedure described above has been tested on two examples. The first shows the calculation of turn-to-turn capacitances of the high voltage winding in a power transformer. The second takes a more "macroscopic" approach, in which the conductors in one row of the high voltage winding are grouped into a disc and the disc-to-disc capacitances are then observed.

The turn-to-turn capacitances were calculated in three ways:

- BEM approach. The total number of unknown coefficients of the surface charge distribution was 896.

- FEM approach using the Ansoft Maxwell® package. The total number of elements was 36180 and the total number of nodes was 2103, which results in a 0.1% error in the calculation of energy.

- Analytical approach based on the cylindrical approximation (2) for the serial capacitance between radially neighboring conductors, or on the parallel-plate approximation (1) for the parallel capacitance between axially neighboring conductors.

Fig. 3 shows the BEM solution for one case in which the potential of the middle conductor is set to 1 V. It illustrates the distribution of the surface charge density on that conductor and the induced surface charge densities on the neighboring conductors.

Fig. 3. BEM solution for calculation of turn-to-turn 

capacitances 

The same example was solved using the Ansoft Maxwell® package, as shown in Fig. 4.

Fig. 4. FEM solution for calculation of turn-to-turn 

capacitances (Ansoft Maxwell ® ) 

A comparison of the turn-to-turn capacitances obtained with the different approaches is given in Table I. CiS and CiP denote the serial and parallel capacitance between the two innermost conductors, CS and CP between the two middlemost conductors, and CoS and CoP between the two outermost conductors.

TABLE I

TURN-TO-TURN CAPACITANCE RESULTS

turn-to-turn   CiS [nF]   CiP [nF]   CS [nF]   CP [nF]   CoS [nF]   CoP [nF]
Maxwell®       1.70       0.09       1.81      0.03      1.93       0.10
Analytical     1.56       0.04       1.66      0.04      1.76       0.05
BEM            1.68       0.09       1.77      0.03      1.89       0.10

The disc-to-disc example results solved by BEM can be 

seen in Fig. 5 and the same example solved using Ansoft 

Maxwell ® package can be seen in Fig. 6. 

Fig. 5. BEM solution for calculation of disc-to-disc 

capacitances 

Fig. 6. FEM solution for calculation of disc-to-disc 

capacitances (Ansoft Maxwell ® ) 

A comparison of the results for the disc-to-disc capacitance calculation is given in Table II.

TABLE II

DISC-TO-DISC CAPACITANCE RESULTS

disc-to-disc   Cbottom [nF]   Cmid1 [nF]   Cmid2 [nF]   Ctop [nF]
Maxwell®       2.83           2.88         2.90         2.83
Analytical     2.36           2.36         2.36         2.36
BEM            2.81           2.73         2.79         2.75

The combined results show significant errors in the analytical approach and justify the use of numerical approaches. Even with a very coarse discretization, the BEM results differ by only a few percent from the results obtained by FEM.
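The size of these differences can be read directly from the tabulated values. The short Python sketch below takes the FEM (Maxwell) column entries of Tables I and II as reference and computes the largest relative deviation of the BEM and analytical results; the dictionary and function names are illustrative only.

```python
# Capacitances in nF, copied from Tables I and II.
TURN = {
    "Maxwell":    [1.70, 0.09, 1.81, 0.03, 1.93, 0.10],
    "Analytical": [1.56, 0.04, 1.66, 0.04, 1.76, 0.05],
    "BEM":        [1.68, 0.09, 1.77, 0.03, 1.89, 0.10],
}
DISC = {
    "Maxwell":    [2.83, 2.88, 2.90, 2.83],
    "Analytical": [2.36, 2.36, 2.36, 2.36],
    "BEM":        [2.81, 2.73, 2.79, 2.75],
}

def max_rel_dev(values, reference):
    """Largest relative deviation |v - ref| / ref over corresponding entries."""
    return max(abs(v - ref) / ref for v, ref in zip(values, reference))

if __name__ == "__main__":
    for name, table in (("turn-to-turn", TURN), ("disc-to-disc", DISC)):
        for method in ("BEM", "Analytical"):
            dev = max_rel_dev(table[method], table["Maxwell"])
            print(f"{name:13s} {method:10s} max deviation {100 * dev:5.1f} %")
```

Note that the small parallel capacitances in Table I are given to only one significant digit, so their relative deviations are dominated by rounding.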

IV. COMPUTATION OF INDUCTANCES 

Using the analogy introduced in capacitance 

computation, inductances can be computed on the basis 

of magnetostatic analysis. The magnetic field in space 

composed of conducting regions with known currents and 

regions filled with different magnetic materials can be 

solved by a pair of coupled integral equations. Because the problem is linear when a constant magnetic permeability is assumed, the magnetic vector potential and the magnetic field strength can be written as:

$$\vec A(\vec r) = \vec A_M(\vec r) + \vec A_S(\vec r)$$

$$\vec H(\vec r) = \vec H_M(\vec r) + \vec H_S(\vec r), \tag{14}$$

where $\vec A$ is the total magnetic vector potential, $\vec A_M$ is the magnetic vector potential caused by the surface magnetization current density $\vec K_M$ and $\vec A_S$ is the magnetic vector potential caused by the imposed current density $\vec J_S$. The same subscripts and definitions apply to the magnetic field strength.

 

The magnetic vector potential $\vec A(A)$ at any point A on the surface that bounds the model is related to the surface magnetization current density $\vec K(B)$ on the total surface of all boundaries by the equation:

$$\vec A(A) = \int_S \vec K(B)\, G(B, A)\, dS + \vec A_S(A), \tag{15}$$

where $G(B, A)$ is written in equation (4). The surface magnetization current density $\vec K(D)$ at any point D on the boundary of two different magnetic materials is related to the surface magnetization current density $\vec K(B)$ on the total surface of all the boundaries by the equation:

$$\vec K(D) - 2\,\frac{\mu_o - \mu_i}{\mu_o + \mu_i}\int_S K(B)\, \vec H_T(B, D)\, dS = 2\,\frac{\mu_o - \mu_i}{\mu_o + \mu_i}\, \vec H_S(D) \times \vec n. \tag{16}$$

Here, $\vec n$ is the unit vector normal to the surface at the point D, $\mu_o$ is the permeability of the region in the direction of the normal unit vector on the surface at the point D and $\mu_i$ is the permeability of the region in the opposite direction.

Under the same simplification as in the capacitance calculation, the geometry of the windings can be approximated as axially symmetric, so the problem becomes two-dimensional and the surface integrals in (15) and (16) can be integrated in the circumferential direction and reduced to line integrals. The kernels of the integral equations are:

$$G(\vec r, \vec r') = \frac{1}{\pi k}\sqrt{\frac{r'}{r}}\left[\left(1 - \frac{k^2}{2}\right)K(k) - E(k)\right] \tag{17}$$

$$\vec H_T(\vec r, \vec r') = \left[H_R(\vec r, \vec r')\,\vec a_r + H_Z(\vec r, \vec r')\,\vec a_z\right] \times \vec n,$$

where $G(\vec r, \vec r')$ is the kernel for the magnetic vector potential and $H_R(\vec r, \vec r')$ and $H_Z(\vec r, \vec r')$ are the kernels for the radial and axial components of the magnetic field strength:

$$H_R(\vec r, \vec r') = \frac{k\,(z - z')}{4\pi r\sqrt{r r'}}\left[-K(k) + \frac{r'^2 + r^2 + (z - z')^2}{(r - r')^2 + (z - z')^2}\,E(k)\right]$$

$$H_Z(\vec r, \vec r') = \frac{k}{4\pi\sqrt{r r'}}\left[K(k) + \frac{r'^2 - r^2 - (z - z')^2}{(r - r')^2 + (z - z')^2}\,E(k)\right] \tag{18}$$

$$k^2 = \frac{4 r r'}{(r + r')^2 + (z - z')^2}.$$

Here, k is the modulus of the elliptic integrals of the first and the second kind K(k) and E(k). The vector $\vec r' = r'\vec a_r + z'\vec a_z$ defines the position of the source point (B), and the vector $\vec r = r\vec a_r + z\vec a_z$ defines the position of the point of interest (A, D).
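As with the electrostatic kernel, the vector potential kernel can be verified against a direct integration around a circular current loop. The sketch below is illustrative and uses hypothetical function names. It evaluates $G = \frac{1}{\pi k}\sqrt{r'/r}\,[(1 - k^2/2)K(k) - E(k)]$ and compares it with $\frac{1}{4\pi}\int_0^{2\pi} r'\cos\varphi\,/\,d\;d\varphi$, the azimuthal component of the vector potential of a unit-current loop with the factor $\mu_0$ left out. Both elliptic integrals are computed with the arithmetic-geometric mean.

```python
import math

def ellip_ke(k2):
    """Complete elliptic integrals K(k) and E(k) for k2 = k**2 < 1 via the AGM scheme."""
    a, b = 1.0, math.sqrt(1.0 - k2)
    s, w = 0.5 * k2, 1.0          # s accumulates sum of 2**(n-1) * c_n**2, with c_0 = k
    for _ in range(40):
        c = 0.5 * (a - b)
        a, b = 0.5 * (a + b), math.sqrt(a * b)
        s += w * c * c
        w *= 2.0
    big_k = math.pi / (2.0 * a)
    return big_k, big_k * (1.0 - s)

def vector_potential_kernel(r, z, rp, zp):
    """Closed-form kernel G for a loop of radius rp at height zp, observed at (r, z)."""
    k2 = 4.0 * r * rp / ((r + rp) ** 2 + (z - zp) ** 2)
    k = math.sqrt(k2)
    big_k, big_e = ellip_ke(k2)
    return math.sqrt(rp / r) * ((1.0 - 0.5 * k2) * big_k - big_e) / (math.pi * k)

def vector_potential_quadrature(r, z, rp, zp, n=2000):
    """Same quantity by midpoint integration of rp*cos(phi)/(4*pi*d) around the loop."""
    total = 0.0
    for j in range(n):
        phi = 2.0 * math.pi * (j + 0.5) / n
        d = math.sqrt(r * r + rp * rp - 2.0 * r * rp * math.cos(phi) + (z - zp) ** 2)
        total += rp * math.cos(phi) / d
    return total * (2.0 * math.pi / n) / (4.0 * math.pi)

if __name__ == "__main__":
    print(vector_potential_kernel(0.8, 0.3, 1.0, 0.0))
    print(vector_potential_quadrature(0.8, 0.3, 1.0, 0.0))
```

Multiplying the kernel by $2\pi r \mu_0$ gives the flux linked by a coaxial filament of radius r, that is, the mutual inductance of two coaxial circular filaments.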

The system of integral equations (15) and (16) is solved by BEM using the same technique as in the capacitance calculation above: the interfaces between different magnetic materials are divided into finite segments and the unknown distribution of surface magnetization current density is approximated by a linear combination of predefined basis functions:

$$K_i(\vec r') = \sum_{n=1}^{N} K_{in}\, t_{in}(\vec r'). \tag{19}$$

Again, using the simplest adequate approach, the basis functions are constant on the finite segment (N=1). The application of (19) to (15) and (16) results in:

$$A_0(\vec r) = \sum_{i=1}^{N} K_i \int_{C_i} G(\vec r, \vec r')\, dC_i + A_S(\vec r); \quad \vec r \in C_0$$

$$K_i(\vec r) - 2\,\frac{\mu_o - \mu_i}{\mu_o + \mu_i}\sum_{i=1}^{N} K_i \int_{C_i} \vec H_T(\vec r, \vec r')\, dC_i = 2\,\frac{\mu_o - \mu_i}{\mu_o + \mu_i}\left[H_{rS}(\vec r)\,\vec a_r + H_{zS}(\vec r)\,\vec a_z\right] \times \vec n; \quad \vec r \in C_i. \tag{20}$$

Here, $C_0$ is the boundary at known magnetic vector potential and $C_i$ is an interface between two magnetic materials. Using the point-matching technique, a linear equation system for the unknown coefficients $K_i$ is derived. An example of the distribution of surface magnetization current density on the core of the transformer is shown in Fig. 7.

As can be seen from equations (15) and (16), it is still necessary to determine the contribution of the imposed current density to the magnetic vector potential, $A_S(\vec r)$, and the corresponding radial and axial components of the magnetic field strength, $H_{rS}(\vec r)$ and $H_{zS}(\vec r)$.

The magnetic vector potential and magnetic field strength of a circular conductor of rectangular cross section have been calculated in [9] and are presented here for completeness.

Fig. 7. Distribution of surface magnetization current density on 

transformer core boundaries 

Using Fig. 8. for clarity, the equations are: 

T1( R1, R2, r, zZ1) T1( R1, R2, r, Z2 z) 

 

; 

z Z1 

 

TA( 

R1, R2, r) T1( R1, R2, r, zZ1) AS 

 

T1( 

R1, R2, r, Z2 z); Z1 z Z2 

 

 

T1( R1, R2, r, Z2 z) T1( R1, R2, r, zZ1) 

; 

z Z2 

(21) 

T2( R1, R2, r, zZ1) T2( R1, R2, r, Z2 z) 

 

; 

z Z1 

 

TB( 

R1, R2, r) T2( R1, R2, r, zZ1) H zS 

T2( 

R1, R2, r, Z2 z); Z1 z Z2 

 

 

T2( R1, R2, r, Z2 z) T2( R1, R2, r, zZ1) 

; 

z Z2 

(22) 

H T ( R , R , r, Z z) T ( R , R , r, zZ ). 

rS 

3 1 2 2 3 1 2 1 

R1 R2 

r 

Fig. 8. Circular conductor with rectangular cross section 

The subfunctions TA, TB, T1, T2 and T3 are: 

 

0Ira 

R2 r T1( R1, R2, r, a) 

ln 

 

4 

 

 

R1r 2 2 

R2 r a 

 

2 2 

 

R1r a 

 

 

3 

2 

0IrR2 sin d 2 X( R , r, a, ) aX( R , r, a, 

) 

 

0 

z 

Z2 

Z1 

 

2 2 

I

3 

2 

0 1 

sin 

IrR d 

2 

 

X( R , r, a, ) a X( R , r, a, 

) 

 

 

0 

1 1 

 

0Ia 

co s X( R2, r, a, ) 2 

 

0 

X( R1, r, a, ) d 

 

0Ir rsin arctan 

2 

a 

0 

2 

 

R2 

 

X( R2, r, a, ) 2 

R 

1 

 

X( R1, r, a, 

) 

 

 

2 

0Ir 

a sincossin 

 

 

 

cossind 

 

4 R rcos 

X( R , r, 

a, 

) 

0 

2 2 

2 

2 

0Ir 

a 

1 

R 

1 d X( R2, r, a, ) 4 

R 

1 

 

X( R 

0 1, 

r, a, 

) 

 

sin cos sin 

, 

cos ( , , , ) d 

 

 

 

R r X R r a 

(23) 

1 1 

I Ia 

T2( R1, R2, r, a) R2 R1 

2 2 

 

 

 

 

 

2 

ln 

R r 

 

 

R1r 2 2 

R2 r a 

 

2 2 

R1r a 

 

 

Iar 

2 

 

sin 

 

1 R 

0 

2 rco s X( R2, r, a, ) 

R2 

 

 

X( R2, r, a, 

) 

 

 

d 

 

 

sin R1 

 

1 

d 

 

 

 

 

R 

0 

1rco s X( R1, r, a, ) X( R1, r, a, 

) 

 

 

 

2 

Ir rsin R2 

arctan 

2 

 

a 

 

 

 

X( 

R 

0 

2, 

r, a, 

) 

2 

R 

1 

sin d, X( R , r, a, 

) 

 

(24) 

1 

2 2 2 

X( R, r, a, ) R r a 2Rrcos (25) 

2 2 

Ir R2 r R2 r a 

T3( R1, R2, r, a) 

ln 

 

4 2 2 

R1 r R1 r a 

 

 

 

 

2 

I Ir 

co s X( R2, r, a, ) X( R1, r, a, ) d 

2 

4 

0 

 

sincossin R2 

 

1 d 

R 

0 

2 rco s X( R2, r, a, ) X( R2, r, a, 

) 

 

 

 

 

 

0 

 

 

 

 

sincos sin 

R1 

 

1d R1rcos X( R1, r, a, 

) 

X( R1, r, a, 

) 

 

 

 

 

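$$T_B(R_1, R_2, r) = \begin{cases} I\,(R_2 - R_1); & r < R_1 \\ I\,(R_2 - r); & R_1 \le r \le R_2 \\ 0; & r > R_2 \end{cases} \tag{26}$$

$$T_A(R_1, R_2, r) = \begin{cases} \dfrac{\mu_0 I r}{2}\,(R_2 - R_1); & r < R_1 \\[2mm] \dfrac{\mu_0 I r}{2}\left[(R_2 - r) + \dfrac{r^3 - R_1^3}{3 r^2}\right]; & R_1 \le r \le R_2 \\[2mm] \dfrac{\mu_0 I}{2}\,\dfrac{R_2^3 - R_1^3}{3 r}; & r > R_2 \end{cases}$$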

Finally, the inductance can be calculated from the stored magnetic energy:

$$\frac{1}{2} L I^2 = \frac{1}{2}\int_V \vec J_S \cdot \vec A\, dV$$

$$L = \frac{1}{I^2}\int_V \vec J_S \cdot \left[\vec A_M(\vec r) + \vec A_S(\vec r)\right] dV. \tag{27}$$

As can be seen from equation (27), the inductance of the i-th conductor can be separated into two parts:

$$L_i = L_{iM} + L_{iS}, \tag{28}$$

where $L_{iM}$ is the contribution of magnetizing currents and $L_{iS}$ is the contribution of imposed currents (inductance of the conductor in free space).

The self and mutual inductances of circular conductors with rectangular cross section have been calculated in several papers [10]-[13]. The equations for L and M are the same apart from the limits of integration, so the self-inductance can be treated as a special case of the mutual-inductance equation.

With the assumption of a uniform distribution of current over the conductor's cross section, the total energy stored in the magnetic field of the conductor is:

$$W = \frac{\mu_0 J_1 J_2}{2}\int_{\varphi=0}^{\pi}\int_{z=Z_1}^{Z_2}\int_{Z=Z_3}^{Z_4}\int_{r=R_1}^{R_2}\int_{R=R_3}^{R_4} \frac{\cos\varphi\; r R\, dR\, dr\, dZ\, dz\, d\varphi}{\sqrt{r^2 + R^2 - 2Rr\cos\varphi + (z - Z)^2}}. \tag{29}$$

Using the expressions:

$$W = \frac{1}{2} M I_1 I_2; \qquad J = \frac{I}{S}, \tag{30}$$

the equation for the mutual inductance of a circular conductor with rectangular cross section is:

$$M = \frac{\mu_0\, Q}{(R_2 - R_1)(Z_2 - Z_1)(R_4 - R_3)(Z_4 - Z_3)}, \tag{31}$$

where Q represents the quintuple integral in (29).

Using the equivalences:

$$Z_3 = Z_1; \quad Z_4 = Z_2; \quad R_3 = R_1; \quad R_4 = R_2; \quad I_1 = I_2, \tag{32}$$

after analytically solving the four integrals over r, R, z and Z, according to [13], the final expression for the self-inductance of a circular conductor of rectangular cross section in free space is:

2 

0N 

L ( 

2 2 

2, 2, , ) ( 1, 1, 

, ) 

2 1 

Q R R H Q R R H 

R R H 0 

QR ( , R, H, ) QR 

( , R, H, ) d 

1 2 2 1

4 

 

2 

h cos QrRh (, , , ) 

 

 

30sin 

2 hcos 

bh arctan 

sin 

2 2 2 

2 2 

h cosrRsin3h r R cos 

 

2 

hsin bh 20 

 

4 2 

Rhsincos 

hrRcos 

arctan 

2 2 

Rsin bh 

 

4 2 

rhsincos 

hRrcos 

arctan 

2 2 

rsin bh 

 

2 

bh 

cos 

2 4 4 2 2 

3cos 

R r Rrcosr R 

15 

2 

2 2 

2r R 

 

2 

bh 

 

bhcos ln 

 

 

2 

bh h 

 

2 

bh h 

 

2 

2 2 2 4 4 

r R 2cos R r 

 

8 

 

5 2 2 

R sin cos ln rRcos 

5 

b 

5 2 2 

r sin cos ln Rrcos b 

5 

3 2 2 2 2 

R cos 5h 3R sin 

ln rRcos 

15 

(33) 

2 

bh 

3 2 2 2 2 

r cos 5h 3r sin 

ln Rrcos 

15 

 

2 

bh , 

 

 

where N is the number of turns, $b = \sqrt{r^2 + R^2 - 2rR\cos\varphi}$, and $H = Z_2 - Z_1$. The integral over $\varphi$ in equation (33) cannot be written in closed form, so it has to be evaluated numerically; the singularities at $\varphi = 0$ and $\varphi = \pi$ are resolved using l'Hôpital's rule.

The method presented above for determining the inductance matrix of power transformer windings has been tested for various conductor positions and different magnetic permeabilities of the transformer core. The comparison showed that the difference from the results of a professional FEM tool (Ansoft Maxwell® software package) is well below 1%, which supports the accuracy and usefulness of the presented method.

V. CONCLUSION 

In this paper we present the method for fast and precise 

computation of capacitances and inductances of high 

power transformer windings with coils of rectangular 

cross section based on the boundary element method. 

Geometry of the windings is axially symmetric, and the 

model may be reduced to two-dimensional axially 

symmetric problem. Capacitances are computed from 

static electric field solution. Surface charge distribution is 

determined by BEM solution of a pair of coupled integral 

equations for static electric fields. Inductances are 

computed from static magnetic field solution. Surface 

magnetization current distribution is determined by BEM 


solution of a pair of coupled integral equations for static 

magnetic fields. 

Boundaries and interfaces are divided into line and arc finite segments, and the unknown distribution of surface charge or current density is approximated by piecewise constant functions. The system of equations for the unknown coefficients of the distribution is obtained by point-matching.

The testing shows that even a very coarse discretization results in satisfactory accuracy of the computation, which supports the applicability of the presented method.

REFERENCES 

[1] E. Bjerkan and H. K. Høidalen, "High frequency FEM-based 

power transformer modeling: investigation of internal stresses due 

to network-initiated overvoltages", International Conference on 

Power Systems Transients (IPST'05), Montreal, Canada, June 

2005. 

[2] Y. Shibuya, T. Matsumoto and T. Teranishi, "Modelling and 

analysis of transformer winding at high frequencies", International 

Conference on Power Systems Transients (IPST'05), Montreal, 

Canada, June 2005. 

[3] K. Pedersen, M. E. Lunow, J. Holboell and M. Henriksen, 

"Detailed high frequency models of various winding types in 

power transformers", International Conference on Power Systems 

Transients (IPST'05), Montreal, Canada, June 2005. 

[4] G. Liang, H. Sun, X. Zhang, X. Cui, “Modeling of Transformer 

Windings Under Very Fast Transient Overvoltages” IEEE 

Transactions on Electromagnetic Compatibility, Vol. 48, No 4, 

November 2006. 

[5] M. Popov, L. van der Sluis, R. P. P. Smeets and J. L. Roldan, 

"Analysis of very fast transients in layer-type transformer 

windings", IEEE Transactions on Power Delivery, Vol. 22, No. 1, 

pp. 238-247, January 2007. 

[6] Ž. Štih, “High Voltage Insulating System Design by Application 

of Electrode and Insulator Contour Optimization”, IEEE 

Transactions on Electrical Insulation, Vol. EI-21, No.4, August 

1986. 

[7] P. Zhu, "Field distribution of a uniformly charged circular arc", 

Journal of Electrostatics, Vol. 63, pp. 1035-1047, March 2005. 

[8] Z. Haznadar, Ž. Štih, "Electromagnetic Fields, Waves and 

Numerical Methods", IOS Press, Amsterdam 2000. 

[9] J. T. Conway, "Trigonometric integrals for the magnetic field of 

the coil of rectangular cross section", IEEE Transactions on 

Magnetics, Vol. 42, No. 5, pp. 1538-1548, May 2006. 

[10] S. I. Babic and C. Akyel, "New analytic-numerical solutions for 

the mutual inductance of two coaxial circular coils with 

rectangular cross section in air", IEEE Transactions on Magnetics, 

Vol. 42, No. 6, pp. 1661-1669, June 2006. 

[11] J. T. Conway, "Inductance calculations for circular coils of 

rectangular cross section and parallel axes using Bessel and Struve 

functions", IEEE Transactions on Magnetics, Vol. 46, No. 1, pp. 

75-81, January 2010. 

[12] D. Yu, K. S. Han, "Self-Inductance of Air-Core Circular Coils 

with Rectangular Cross Section", IEEE Transactions on 

Magnetics, Vol. 23, No. 6, pp. 3916-3921, November 1987. 

[13] I. Doležel, "Self-inductance of an air cylindrical coil", Acta 

, Vol. 34, No. 4, pp. 443-473, 1989.


Numerical and Experimental Investigations of 

the Structural Characteristics of Stator Core 

Stacks 

Mathias Mair ∗ , Bernhard Weilharter †‡ , Siegfried Rainer § , Katrin Ellermann ∗ and Oszkár Bíró †§ 

∗ Institute for Mechanics, University of Technology Graz, Austria 

† Christian Doppler Laboratory for Multiphysical Simulation, Analysis and Design of Electrical 

Machines, Austria 

‡ Institute for Electric Drives and Machines, University of Technology Graz, Austria 

§ Institute for Fundamentals and Theory in Electrical Engineering, University of Technology Graz, Austria 

E-mail: mair@tugraz.at 

Abstract—The response characteristics of two stator core stacks are investigated by experimental modal analysis. 

Furthermore, the modal parameters like the eigenfrequencies and eigenvectors are calculated from a numerical modal 

analysis. Afterwards, their frequency response functions are computed and compared with the measured frequency response 

functions. In order to achieve a set of material parameters, the computed response characteristics are adjusted to match 

the measured response characteristics. 

Index Terms—experimental modal analysis, homogeneous material model, numerical modal analysis, stator core stack 


I. INTRODUCTION

The development of electrical machines requires an accurate dynamic analysis in order to reduce side effects of vibration, e.g. noise and material damage. Since an electrical machine consists of many complex and heterogeneous parts, such as the stator core stack, its mechanical modeling is a complicated task. For the noise computation of electrical machines in particular, the structural behavior of the stator core stack is of interest. It is mainly influenced by forces caused by the magnetic field in the air gap acting on the stator teeth [1]–[4]. An analytical method proposed in [5] allows the stator vibration to be computed with a two-dimensional ring model. Its disadvantage is that the response characteristics along the axial direction cannot be considered. Other approaches, which treat the stator core stack as a thin cylinder, have been investigated in [6]–[11].

With numerical methods, e.g. the finite element method 

(FEM), the three dimensional behavior of the stator can 

be determined [12], [13]. The main problem is to set up 

an appropriate material model for the numerical analysis, 

which considers the heterogeneous composition of the 

stator core stack consisting of laminated sheets coated 

with resin [14]. 

Experimental investigations of models consisting of laminated iron sheets and a comparison with numerical simulations using three-dimensional homogeneous models have been presented in [15]. Another investigation of a stator core stack was carried out in [16] in order to obtain isotropic material parameters. With this approach, modes with pure radial displacement can be calculated adequately. A similar investigation was presented in [17], with the difference that the FEM model used a material model with transversally isotropic elasticity. However, the measurement points were not uniformly distributed on the outer ring surface. As a consequence, the measured response characteristics of the stator core stack are limited.

In this paper, the three dimensional structural vibration 

behavior of stator core stacks is investigated by using 

the finite element method in conjunction with a modal 

analysis. The influence of the lamination along the axial 

direction will be considered by using a homogeneous material 

model with transversally isotropic elasticity. For the 

identification of the corresponding material parameters, 

two finite element models of stator core stacks have been 

set up, one with teeth and one without teeth. A modal 

analysis is carried out to determine the eigenfrequencies 

and eigenforms (modes) of the finite element models. 

Thereafter the frequency response characteristics of the 

two structures are computed with a reduced order model 

by applying a modal reduction. 

Acceleration measurements for which the structures 

have been excited with an electrodynamic shaker in a 

frequency range of 0 − 6000 Hz have been performed 

at 60 points on the stator core stacks. An experimental 

modal analysis then provides the measured response 

characteristics and eigenfrequencies and eigenforms [18]. 

Finally, the results of the numerical investigation are 

compared with the results from the experimental modal 

analysis. The material parameters are adjusted step by 

step until the response characteristics of the numerical 

model approximate the measured one sufficiently. This 

way a set of material parameters for homogeneous material 

models for the stator core stacks is obtained, which 

describes the structural behavior adequately.

II. EXPERIMENTAL MODAL ANALYSIS (EMA) 

An experimental modal analysis allows for the determination 

of the response characteristics of the stator core 

stacks excited by a shaker. For this, acceleration sensors 

are recording the vibration at distinct measurement points 

on the stator core stacks. Then, the characteristic response 

behavior can be derived, transforming the resulting time 

signals into the frequency domain. 

A. Test stand for experimental modal analysis 

In order to measure the vibration on the stator core stacks, a test stand as shown in Fig. 1 has been built. Ropes made of an elastic material are connected to a portal frame and suspend the stator core stack. In addition, the table is mounted on air bearings. This reduces the influence of the adjacent structure to a minimum. In order to excite the structure, an electromagnetic shaker is mounted on the table. The connection between structure and shaker is realized by a push rod and a force sensor affixed to the test structure with a two-component adhesive. This way, the measurement setup can be built up and disassembled easily without machine tools. For the measurement, the shaker is controlled by a measurement system which also records the signals of the acceleration sensors.

Fig. 1: Test stand 

B. Measurement procedure

The measurement points are located on the outer 

surface of the structure at sixty defined positions, see 

Fig. 2. The excitation is applied to point no. 1 for all 

measurements. At this point, the structure is excited by 

the shaker with a so-called periodic chirp signal in a 

frequency range from 2 Hz to 6400 Hz. The applied signal 

is repeated ten times with the entire frequency range 

passed through in each sequence within a short time 

period of 2.5 s. 

In the course of the measurement procedure, the frequency 

response function (FRF) in reference to the excitation 

point is determined for each measurement point. 

Finally, the arithmetically averaged acceleration and force signals of these ten measurement runs are used to derive the FRFs corresponding to each of the measurement points.


[Figure: stator core stack with the numbered measurement points on its outer surface]

Fig. 2: Defined measurement points

C. Identification of modal parameters 

After all FRFs are measured, the next step is to identify the modal parameters. This is done by the so-called PolyMAX frequency-domain method [19], which is a curve fitting technique. In a first step a least-squares method fits the polynomials

$$V_0(\Omega) = \left[\sum_{i=0}^{p} z^i \beta_i\right]\left[\sum_{i=0}^{p} z^i \alpha_i\right]^{-1} \tag{1}$$

to the measured FRFs. Here, $\Omega$ denotes the excitation frequency and $V_0(\Omega)$ is called the frequency response matrix. $\beta_i$ are the numerator coefficient matrices, $\alpha_i$ are the denominator coefficient matrices, $z^i$ are the complex basis vectors in the discrete frequency domain and p is the order of the polynomials.

With the known polynomial functions, the eigenvalues 

λi and the modal participation vectors li can be calculated. 

To determine the eigenvectors ri as well as the 

lower and upper residual matrices LR and UR, a further 

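least-squares method, based on the pole-residue model 

$$V(\Omega) = \sum_{i=0}^{q} \left[ \frac{r_i\, l_i^{T}}{\lambda_i - j\Omega} + \frac{r_i'\, l_i^{H}}{\lambda_i' - j\Omega} \right] + \frac{LR}{\Omega^{2}} + UR , \tag{2}$$

is applied. The dashed symbols mark the conjugate 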

complex variables. The determined frequency response 

matrix V(Ω) 

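$$V(\Omega) = \begin{bmatrix} f_{11} & f_{12} & \cdots & f_{1m} \\ f_{21} & f_{22} & & \vdots \\ \vdots & & \ddots & \vdots \\ f_{n1} & \cdots & \cdots & f_{nm} \end{bmatrix} \tag{3}$$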

is filled with the frequency response functions fkl between 

the excitation point l and the measuring point k. 

The size of V(Ω) is thus n × m, given by the n measuring 

points and the m excitation points. 

The identified modal parameters and frequency response 

functions fkl allow a convenient description of 

the measured response characteristics for the later comparison 

with numerical results. 

D. Measurement results 

In order to determine material parameters for a specific 

numerical model, the mode-shapes determined with the 


EMA, corresponding to the estimated eigenvalues, must 

be identified. To distinguish the different mode-shapes, 

a numbering system is established. The numbering comprises 

three numbers and refers to a cylindrical coordinate 

system. The first digit describes the number of maxima 

of the mode in radial direction along the azimuthal 

coordinate axis, see Fig. 3. The second digit represents 

the number of zero crossings of the mode in radial 

direction along the z-axis. The third digit is a counter to 

differentiate modes with the same deformation properties 

regarding the first two digits. 


Fig. 3: Example for a mode (3,2,0) 
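The numbering system can be represented as a small data structure. The class and field names below are illustrative and not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class ModeLabel:
    azimuthal_maxima: int      # first digit: maxima along the azimuthal axis
    axial_zero_crossings: int  # second digit: zero crossings along the z-axis
    counter: int               # third digit: separates modes sharing the first two

    @classmethod
    def parse(cls, text):
        """Parse a label such as '(3,2,0)' or '(3, 2, 0)'."""
        a, b, c = (int(part) for part in text.strip().strip("()").split(","))
        return cls(a, b, c)

    def same_shape_family(self, other):
        """True if both labels share the first two digits."""
        return (self.azimuthal_maxima, self.axial_zero_crossings) == (
            other.azimuthal_maxima, other.axial_zero_crossings)
```

For example, `ModeLabel.parse("(3,2,0)")` and `ModeLabel.parse("(3,2,1)")` belong to the same shape family and differ only in the counter.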

1) Results of the stator core stack without teeth: 

The sum of all measured as well as the sum of all 

approximated frequency response functions fkl, the latter 

calculated by the identified modal parameters, are depicted 

in Fig. 4, with the different mode patterns indicated 

by the introduced numbering system. 

[Figure 4: logarithmic magnitude [m/N] over frequency [Hz] from 500 Hz to 6000 Hz. Curves: sum of all measured FRFs and sum of all approximated FRFs (using modal parameters). The resonance peaks are labeled with the mode numbers of the introduced numbering system.]

Fig. 4: Sum of all frequency response functions of the 

stator core stack without teeth 

As Fig. 4 shows, with the increase of the excitation 

frequency the mean of the magnitude decreases. This 

corresponds to mass-dominated dynamical behavior [20, p.294]. 


The modes (2,0,0), (2,1,0), (3,0,0), (3,1,0), (4,0,0), 

(4,0,1) and (5,0,0) have the most distinctive magnitude in 

the investigated frequency domain. Some of these modes, 

(2,0,0), (3,0,0), (4,0,0) and (5,0,0) could be calculated by 

analytical two-dimensional methods, see [7], [5] or [10]. 

Other modes like (2,2,0) or (3,1,1) with non-uniform 

radial displacements along the z-axis can only be treated 

by three-dimensional approaches. 

Table I summarizes the measurement results for the 

stator core stack without teeth and lists modes with their 

appropriate eigenfrequencies and damping ratios. 

TABLE I: Modes, eigenfrequency and damping ratio of stator core stack without teeth 

No.  Mode        Eigenfrequency  Damping ratio
1    (2, 0, 0)    769.22 Hz      0.411387 %
2    (2, 1, 0)    795.03 Hz      0.959535 %
3    (2, 2, 0)   1282.95 Hz      1.197700 %
4    (2, 2, 1)   1353.39 Hz      1.086200 %
5    (3, 2, 0)   1728.14 Hz      1.442430 %
6    (3, 1, 0)   2092.07 Hz      0.218612 %
7    (3, 0, 0)   2109.96 Hz      0.093046 %
8    (3, 1, 1)   2192.46 Hz      1.122050 %
9    (3, 2, 0)   2278.98 Hz      0.691529 %
10   (3, 3, 0)   2509.73 Hz      0.897709 %
11   (3, 3, 1)   2569.24 Hz      1.057550 %
12   (3, 3, 2)   2631.76 Hz      1.108720 %
13   (4, 0, 0)   3860.04 Hz      0.043218 %
14   (4, 0, 1)   3871.37 Hz      0.050271 %
15   (3, 4, 0)   3947.18 Hz      0.496143 %
16   (3, 4, 1)   4037.01 Hz      1.110130 %
17   (4, 2, 0)   4414.03 Hz      0.606060 %
18   (4, 2, 1)   4482.12 Hz      0.590154 %
19   (4, 4, 0)   4756.58 Hz      0.348687 %
20   (4, 4, 1)   4773.74 Hz      0.360285 %
21   (5, 4, 0)   4940.08 Hz      0.930774 %
22   (5, 0, 0)   5927.25 Hz      0.127039 %

2) Results of the stator core stack with teeth: Similarly 

to the results given in Fig. 4, the sum of all measured and 

the sum of all approximated frequency response functions 

fkl for the stator core stack with teeth are plotted in Fig. 5. 


[Figure 5: logarithmic magnitude over frequency from 500 Hz to 6000 Hz. Curves: sum of all measured FRFs and sum of all approximated FRFs (using modal parameters). The resonance peaks are labeled with mode numbers.]

Fig. 5: Sum of all frequency response functions of the stator core stack with teeth

Comparing the FRFs depicted in Fig. 4 and Fig. 5, 

the number of distinct modes in Fig. 5 is greater. Due 

to the higher mass, the corresponding eigenfrequencies 

of the stator core stack with teeth are lower than those 

of the stator core stack without teeth. Furthermore, the 

frequency spacing of distinct modes, e.g. (3,0,0) and 

(3,1,0) increases. 

Modes with eigenfrequencies above 5400 Hz cannot be identified, because the measurement grid is too coarse to resolve the corresponding eigenforms. In this frequency range, one would expect modes with seven or more maxima of the radial displacement along the azimuthal axis. Such modes cannot be identified with only twelve measurement points in the azimuthal direction. 
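The limitation of the coarse grid can be illustrated with a short sampling experiment. The sketch assumes twelve equally spaced azimuthal points and a pure cosine shape of the radial displacement; both are simplifications and not statements about the actual test setup.

```python
import math

N_POINTS = 12  # azimuthal measurement points (from the text)


def sampled_mode(order, n_points=N_POINTS):
    """Radial displacement cos(order * theta) at equally spaced angles (assumed)."""
    return [math.cos(order * 2.0 * math.pi * k / n_points) for k in range(n_points)]


def indistinguishable(order_a, order_b, tol=1e-9):
    """True if two circumferential orders give the same samples on the grid."""
    a, b = sampled_mode(order_a), sampled_mode(order_b)
    return all(abs(x - y) < tol for x, y in zip(a, b))
```

Under these assumptions, `indistinguishable(7, 5)` is true: a shape with seven maxima produces exactly the same twelve samples as one with five maxima, so the two cannot be told apart on this grid.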

Table II lists measurement results for the stator core 

stack with teeth. 

TABLE II: Modes, eigenfrequency and damping ratio of stator core stack with teeth 

No.  Mode        Eigenfrequency  Damping ratio
1    (2, 0, 0)    661.27 Hz      0.054903 %
2    (2, 0, 1)    667.84 Hz      0.055824 %
3    (2, 1, 0)    720.08 Hz      0.337501 %
4    (2, 1, 1)    727.85 Hz      0.332884 %
5    (2, 2, 0)   1365.85 Hz      1.005810 %
6    (2, 2, 1)   1416.91 Hz      0.939553 %
7    (3, 0, 0)   1767.43 Hz      0.047843 %
8    (3, 1, 0)   1851.83 Hz      0.205125 %
9    (3, 2, 0)   2313.17 Hz      0.640636 %
10   (0, 0, 1)   2372.92 Hz      0.654953 %
11   (0, 0, 2)   2376.10 Hz      0.608697 %
12   (4, 3, 0)   2755.43 Hz      0.382039 %
13   (4, 0, 0)   3107.52 Hz      0.097675 %
14   (4, 0, 1)   3116.28 Hz      0.152128 %
15   (4, 1, 0)   3190.75 Hz      0.180018 %
16   (4, 3, 3)   3288.28 Hz      0.743811 %
17   (0, 1, 0)   3955.81 Hz      0.306280 %
18   (5, 2, 0)   4074.92 Hz      0.188232 %
19   (5, 0, 0)   4423.63 Hz      0.046342 %
20   (5, 1, 0)   4484.19 Hz      0.173786 %
21   (5, 4, 0)   4655.18 Hz      0.661100 %
22   (5, 4, 1)   4745.87 Hz      0.237239 %
23   (6, 0, 0)   5314.37 Hz      0.031471 %
24   (6, 1, 0)   5356.55 Hz      0.039125 %

III. NUMERICAL MODAL ANALYSIS 

The finite element method (FEM) is used for the numerical simulation of the dynamical behavior of the stator core stacks. For this purpose, an adequate simulation model has to be set up. 

For the FEM model, some simplifications are made: 

• the contacts between the laminations are neglected 

• the grain orientation of the cold-rolled silicon-iron alloy is neglected 

• a linear and homogeneous material model is assumed 

• the FEM model is assumed to be linear 

Performing a numerical modal analysis with an adequate FEM model yields the modal parameters (eigenfrequencies, eigenforms and damping coefficients), which can be related directly to the modal parameters obtained from the measurements. 

A. Material model 

By considering the above limitations, a material model 

with transversally isotropic elasticity is implemented. 


Fig. 6: Coordinate system of the stator core stack 

The used transversally isotropic elasticity of the material 

model corresponds to the coordinate system, depicted 

in Fig. 6. The flexibility matrix S for a material model 

with a transversally isotropic elasticity is given by 

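$$
S = \begin{bmatrix}
\frac{1}{E_x} & -\frac{\nu_{yx}}{E_x} & -\frac{\nu_{zx}}{E_x} & 0 & 0 & 0 \\
-\frac{\nu_{xy}}{E_x} & \frac{1}{E_x} & -\frac{\nu_{zy}}{E_x} & 0 & 0 & 0 \\
-\frac{\nu_{xz}}{E_z} & -\frac{\nu_{yz}}{E_z} & \frac{1}{E_z} & 0 & 0 & 0 \\
0 & 0 & 0 & \frac{2(1+\nu_{xy})}{E_x} & 0 & 0 \\
0 & 0 & 0 & 0 & \frac{1}{G_{xz}} & 0 \\
0 & 0 & 0 & 0 & 0 & \frac{1}{G_{xz}}
\end{bmatrix} , \tag{4}
$$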

where E is Young’s modulus, G is the shear modulus and 

ν is Poisson’s ratio. The material model and therefore 

the flexibility matrix S of the transversally isotropic 

elasticity has rotationally symmetrical properties. Due to 

the symmetric properties of S, the following conditions 

are valid: 

$$\nu_{xy} = \nu_{yx} \tag{5}$$

$$\frac{\nu_{xz}}{E_z} = \frac{\nu_{zx}}{E_x} \tag{6}$$

B. Modal description 

Based on the finite element method, the equations of motion of a structurally damped system can be formulated as the linear system of differential equations 

$$M \ddot{\hat{u}} + D \dot{\hat{u}} + K \hat{u} = f . \tag{7}$$

Thereby, M is the mass matrix, K is the stiffness matrix, 

f is the excitation force and D is the proportional

damping matrix defined by the Rayleigh damping model 

[12, p.950ff] 

$$D = \alpha M + \beta K \tag{8}$$

as a linear combination of the mass matrix M and the 

stiffness matrix K with the scalar coefficients α and β. 

Solving the eigenvalue problem of the undamped 

system leads to the eigenvalues λi and eigenvectors 

ri. The modal matrix R consisting of mass-normalized 

eigenvectors defines the transformation 

$$\hat{u} = R z \tag{9}$$

of the displacement in the global state space û to the 

displacement in the modal state space z. 

Using (9) and multiplying (7) from the left with the transposed modal matrix R^T, the proportionally damped equations of motion are transformed into a decoupled system of dimension q 

$$\tilde{M} \ddot{z} + \tilde{D} \dot{z} + \tilde{K} z = \tilde{f} . \tag{10}$$

Since the eigenvectors are mass-normalized, the transformation of the mass matrix M into the modal state space leads to the identity matrix I 

$$\tilde{M} = R^{T} M R = I . \tag{11}$$

The transformed stiffness matrix becomes 

$$\tilde{K} = R^{T} K R = \Lambda = \mathrm{diag}\left[\omega_i^{2}\right] , \tag{12}$$

where ωi is the undamped angular eigenfrequency of the i-th mode. The damping matrix yields a diagonal matrix expressed as 

$$\tilde{D} = R^{T} D R = \mathrm{diag}\left[2 \zeta_i \omega_i\right] , \tag{13}$$

where ζi denotes the modal damping ratio of the i-th mode [20, p.63ff]. 
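The relations (8) and (11) to (13) can be checked numerically on a small example. The sketch below uses a hypothetical 3-DOF system with a diagonal mass matrix and arbitrarily chosen Rayleigh coefficients; none of the numbers come from the stator models.

```python
import numpy as np

# Hypothetical 3-DOF spring-mass chain; values are for illustration only.
M = np.diag([2.0, 1.0, 1.5])
K = np.array([[ 3.0, -1.0,  0.0],
              [-1.0,  2.0, -1.0],
              [ 0.0, -1.0,  1.0]]) * 1.0e4
alpha, beta = 0.5, 1.0e-5          # assumed Rayleigh coefficients
D = alpha * M + beta * K           # eq. (8)

# Solve K r = omega^2 M r via the symmetric form M^(-1/2) K M^(-1/2);
# this shortcut relies on M being diagonal.
m_inv_sqrt = np.diag(1.0 / np.sqrt(np.diag(M)))
omega_sq, Q = np.linalg.eigh(m_inv_sqrt @ K @ m_inv_sqrt)
R = m_inv_sqrt @ Q                 # mass-normalized eigenvectors as columns
omega = np.sqrt(omega_sq)

M_modal = R.T @ M @ R              # eq. (11): identity matrix
K_modal = R.T @ K @ R              # eq. (12): diag(omega_i^2)
D_modal = R.T @ D @ R              # eq. (13): diag(2 zeta_i omega_i)
zeta = np.diag(D_modal) / (2.0 * omega)
```

Since the modal damping matrix equals αI + βΛ, the modal damping ratios follow as ζi = α/(2ωi) + βωi/2, which is the reason why Rayleigh damping keeps the modal equations decoupled.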

At the same time, this approach yields a modal reduction of the equation system, because only the q modes in (10) are retained. Without this reduction, the computing effort increases rapidly. The disadvantage is that the disregarded modes introduce an error in the response characteristics. However, the error resulting from the material model with transversally isotropic elasticity is expected to be much larger, so the truncation error is neglected. 

Assuming a harmonic excitation, the excitation force can be expressed as 

$$f = \check{f}\, e^{j\Omega t} \tag{14}$$

in the frequency domain. Here, \check{f} is the amplitude of the excitation force and Ω describes the excitation frequency. This leads to a harmonic ansatz for the modal displacement: 

$$z = \check{z}\, e^{j\Omega t} \tag{15}$$

where \check{z} denotes the amplitude of the displacement in the modal state space. Hence, (10) becomes 

$$-\Omega^{2} I \check{z} + j\Omega \tilde{D} \check{z} + \tilde{K} \check{z} = \check{\tilde{f}} . \tag{16}$$


Finally, the backward transformation into the global state space leads to 

$$\check{u} = R \left[ -\Omega^{2} I + j\Omega \tilde{D} + \tilde{K} \right]^{-1} R^{T} \check{f} = \tilde{V} \check{f} . \tag{17}$$

Thus, the numerically estimated frequency response matrix \tilde{V} is 

$$\tilde{V}(\Omega) = R \left[ -\Omega^{2} I + j\Omega \tilde{D} + \tilde{K} \right]^{-1} R^{T} . \tag{18}$$

Using (11), (12) and (13), the entries of \tilde{V}(Ω) can be expressed by the frequency response functions \tilde{f}_{kl} 

$$\tilde{f}_{kl}(\Omega) = \sum_{i=1}^{q} \frac{{}_{i}r^{*}_{k}\; {}_{i}r^{*}_{l}}{-\Omega^{2} + \omega_i^{2} + j\Omega\, 2 \zeta_i \omega_i} , \tag{19}$$

which can be related directly to the corresponding frequency response functions fkl(Ω) estimated by the measurement. Here, {}_{i}r^{*}_{k} denotes the k-th entry of the i-th mass-normalized eigenvector ri. 
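Equation (19) translates directly into a modal superposition loop. The two-mode data set below is hypothetical and only serves to exercise the formula.

```python
def frf_entry(omega_exc, eigvecs, omegas, zetas, k, l):
    """Evaluate eq. (19): sum over modes of r_k r_l / (-W^2 + w_i^2 + j W 2 z_i w_i)."""
    total = 0.0 + 0.0j
    for r, w, z in zip(eigvecs, omegas, zetas):
        total += r[k] * r[l] / (-omega_exc**2 + w**2 + 1j * omega_exc * 2.0 * z * w)
    return total


# Hypothetical two-mode data: mass-normalized eigenvectors (one tuple per mode),
# undamped angular eigenfrequencies in rad/s and modal damping ratios.
EIGVECS = [(0.6, 0.8), (0.8, -0.6)]
OMEGAS = [100.0, 250.0]
ZETAS = [0.01, 0.005]

static_value = frf_entry(0.0, EIGVECS, OMEGAS, ZETAS, 0, 0)
resonant_value = frf_entry(100.0, EIGVECS, OMEGAS, ZETAS, 0, 0)
```

The expression is symmetric in k and l, so the synthesized FRFs satisfy reciprocity, and the magnitude near an eigenfrequency is limited only by the modal damping ratio of that mode.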

IV. INFLUENCE OF MATERIAL PARAMETERS ON 

TRANSMISSION BEHAVIOR 

In order to estimate the influence of each material 

parameter, simulations as explained in section III-B are 

carried out for the stator model without teeth. The 

material parameters, except for the density, are varied 

in a distinct range and their influence on the structural 

behavior is investigated. 

The density is determined by measurements. With a mass of 149.8 kg and a volume of 0.019905 m^3, a density of 7525 kg/m^3 results. This value is used for all calculations of this study. 

The material parameters of interest for the influence on the dynamical behavior are the Young’s moduli Ex and Ez, the shear modulus Gxz and the Poisson ratios νxy and νxz. The initial set of material parameters is shown in Table III. Based on this set, each parameter is varied to a lower and a higher value. The influence of each parameter on the frequency response functions is shown below. The results of these investigations are the basis for adjusting the simulated response characteristics to the measured response characteristics. 

TABLE III: Initial dataset of material parameters for a variation of each parameter 

Material parameter   Value
Ex                   190 · 10^9 N/m^2
Ez                   25 · 10^9 N/m^2
Gxz                  10 · 10^9 N/m^2
νxy                  0.3
νxz                  0.3
ϱ                    7525 kg/m^3

A. Variation of the Poisson ratio ν 

In order to analyse the influence of the Poisson ratios 

on the response characteristics, the values are varied from 

0.2 to 0.4. The Poisson ratios νxy and νxz are set equal 

for the calculations. Fig. 7 shows the sum of all frequency 

response functions for the varied Poisson ratios. 


[Figure 7: logarithmic magnitude over frequency from 500 Hz to 3000 Hz. Curves: ν = 0.2, ν = 0.3 and ν = 0.4.]

Fig. 7: Sum of all frequency response functions of the stator core stack without teeth for varied Poisson ratio 

This comparison shows that the influence of the Poisson ratios on the response characteristics is negligible. Therefore, a value of 0.3 is chosen for νxy and νxz. 

B. Variation of the elastic modulus Ex 

In Fig. 8, the sum of the calculated frequency response functions for the variation of Ex is depicted. It can be observed that some eigenfrequencies, for example those of modes 4, 5 and 11, are not influenced by the Young’s modulus Ex. Other modes, such as 7, 8 and 11a, are heavily affected by it: a small variation of Ex yields a large shift of the corresponding eigenfrequencies. If the value of Ex decreases, the eigenfrequencies of modes 7, 8 and 11a decrease, and vice versa. The eigenfrequencies of modes 1 and 2 also depend on the Young’s modulus Ex, but not as strongly as the previous ones. 

[Figure 8: logarithmic magnitude over frequency from 500 Hz to 3000 Hz. Curves: Ex = 170 · 10^9 N/m^2, Ex = 190 · 10^9 N/m^2 and Ex = 210 · 10^9 N/m^2. Labeled peaks: modes 1, 2, 4, 5, 7, 8, 11 and 11a.]

Fig. 8: Sum of all frequency response functions of the stator core stack without teeth for varied Young’s modulus Ex 

C. Variation of the elastic modulus Ez 

The variation of the material parameter Ez, depicted in Fig. 9, leads to a different dynamical behavior than the variation of Ex. The eigenfrequencies corresponding to the modes 1, 2 and 7 are not influenced by a variation of the Young’s modulus Ez. When the value of Ez is increased, all other eigenfrequencies in the considered frequency range are shifted upwards, and when Ez is decreased, these eigenfrequencies are shifted downwards. It is interesting to note that the eigenfrequencies which are independent of the Young’s modulus Ez (modes 1, 2 and 7) depend on the Young’s modulus Ex. 

[Figure 9: logarithmic magnitude over frequency from 500 Hz to 3000 Hz. Curves: Ez = 20 · 10^9 N/m^2, Ez = 25 · 10^9 N/m^2 and Ez = 30 · 10^9 N/m^2. Labeled peaks: modes 1, 2, 3, 4 and 7.]

Fig. 9: Sum of all frequency response functions of the stator core stack without teeth for varied Young’s modulus Ez 

D. Variation of the shear modulus Gxz 

Finally, the results for the variation of the shear modulus Gxz are shown in Fig. 10. It can be seen that the eigenfrequencies corresponding to the modes 1 and 2 are independent of the variation of Gxz. Other modes, like 5, 6, 9, 11 or 12, are heavily affected by the variation of the shear modulus. These eigenfrequencies are shifted downwards by decreasing and upwards by increasing the value of Gxz. 

[Figure 10: logarithmic magnitude over frequency from 500 Hz to 3000 Hz. Curves: Gxz = 8 · 10^9 N/m^2, Gxz = 10 · 10^9 N/m^2 and Gxz = 12 · 10^9 N/m^2. Labeled peaks: modes 1, 2, 5, 6, 9, 10, 11 and 12.]

Fig. 10: Sum of all frequency response functions of the stator core stack without teeth for varied shear modulus Gxz 

Summing up, the influence of the material properties on an eigenfrequency depends strongly on the eigenmode occurring at this frequency. Depending on the characteristics (radial, azimuthal or axial bending) of the mode, Ex, Ez and Gxz have different influences. This background is important for the later adjustment of the response characteristics. 
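One way to reason about such shifts is a first-order estimate based on the Rayleigh quotient: for a mass-normalized mode, ω² equals the modal stiffness, so if a share s of the modal strain energy scales linearly with one material parameter and the mode shape is assumed to stay unchanged, scaling that parameter by a factor c changes the eigenfrequency by the factor sqrt(1 + s(c - 1)). The sketch below is not part of the paper's method, and the energy shares are hypothetical inputs.

```python
import math


def shifted_frequency(f0, energy_fraction, scale):
    """Estimate the eigenfrequency after scaling one stiffness parameter.

    f0              -- eigenfrequency before the change
    energy_fraction -- assumed share of modal strain energy tied to the parameter (0..1)
    scale           -- new parameter value divided by the old one
    """
    return f0 * math.sqrt(1.0 + energy_fraction * (scale - 1.0))


# Hypothetical mode at 2000 Hz, governed entirely by Ex, with Ex lowered from
# 190e9 to 170e9 N/m^2: the estimate drops to roughly 1892 Hz.
example = shifted_frequency(2000.0, 1.0, 170.0 / 190.0)
```

A mode whose strain energy does not involve the varied parameter (`energy_fraction = 0`) keeps its eigenfrequency, which matches the qualitative observation that some peaks do not move in the parameter studies.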

V. ADJUSTMENT OF MATERIAL PARAMETERS 

The dynamical behavior of the numerical model 

strongly depends on the chosen material parameters. 

An iterative process optimizes the eigenfrequencies and 

eigenvectors based on the known influence of the material 

parameters as discussed in section IV. Suitable material 

parameters are found by adjusting the simulated response 

characteristics to the measured ones. 

A. Stator core stack without teeth 

First, a dataset of material parameters is chosen which 

describes the behavior of isotropic elasticity. Thereafter, 

the material parameters of the transversally isotropic 

elasticity are identified for the stator core stack without 

teeth. 

1) Comparison of measured and calculated frequency response functions by using the isotropic dataset: The results from the numerical simulation using isotropic material parameters (Table IV) are compared with the results from the experimental investigation, see Fig. 11. It can be seen that there is no analogy between the simulated and measured response characteristics. Only the first eigenfrequency of the measured and simulated FRFs is similar. Furthermore, fewer eigenfrequencies arise for the simulation model in the investigated frequency range. This comparison shows that a material model with isotropic elasticity is clearly unsuitable. 

TABLE IV: Initial dataset of material parameters of isotropic elasticity 

Material parameter   Value
Ex                   210 · 10^9 N/m^2
Ez                   210 · 10^9 N/m^2
Gxz                  Ex / (2(1+ν))
ν                    0.3
ϱ                    7525 kg/m^3

2) Adjustment of the measured and simulated frequency response function by using transversally isotropic elasticity: In a next step, a material model with transversally isotropic elasticity is used and the required material parameters are adjusted. Table V lists the material parameters used as an initial dataset for the simulation. 



[Figure 11: logarithmic magnitude over frequency from 500 Hz to 3000 Hz. Curves: sum of simulated FRFs in radial direction and sum of measured FRFs in radial direction.]

Fig. 11: Comparison of the sum of all measured and calculated FRFs with isotropic material model of the stator core stack without teeth 


Thereby, the Young’s modulus Ez and the shear modulus Gxz are chosen considerably lower, with 40 · 10^9 N/m^2 and 15 · 10^9 N/m^2, respectively. This shifts the eigenfrequencies downward in the considered frequency range, as explained in section IV. 

Looking at Fig. 12, it can be seen that in the considered frequency range the measured and simulated eigenfrequencies of the modes (2,0,0) and (3,0,0) are close together. 

TABLE V: Dataset of material parameters of transversally isotropic elasticity 

Material parameter   Value
Ex                   210 · 10^9 N/m^2
Ez                   40 · 10^9 N/m^2
Gxz                  15 · 10^9 N/m^2
νxy                  0.3
νxz                  0.3
ϱ                    7525 kg/m^3

[Figure 12: logarithmic magnitude over frequency from 500 Hz to 3000 Hz, with the peaks of mode (2,0,0) and mode (3,0,0) marked.]

Fig. 12: Comparison of the sum of all measured and calculated FRFs with transversally isotropic material model of the stator core stack without teeth 

Now the modes are adjusted by decreasing the elastic

modulus Ex. The value is optimized manually step by step 

until an adequate match is attained. For the Young’s 

modulus Ex, a value could be found which aligns the 

measured and simulated modes (2,0,0) and (3,0,0). The 

next step in the optimization is to adjust the mode 

(3,1,0). Therefore the shear modulus Gxz is reduced. 

Finally, with the Young’s modulus Ez, the other modes 

between (2,0,0) and (3,0,0) can be influenced. A stepwise 

reduction of this value yields an adequate correlation of 

the other modes. 

The resulting material parameters of this optimization 

of the stator core stack without teeth are listed in Table 

VI. 

TABLE VI: Resulting dataset of material parameters for the stator core stack without teeth of transversally isotropic elasticity 

Material parameter   Value
Ex                   191.8 · 10^9 N/m^2
Ez                   24.7 · 10^9 N/m^2
Gxz                  11 · 10^9 N/m^2
νxy                  0.3
νxz                  0.3
ϱ                    7525 kg/m^3

Fig. 13 depicts the response characteristics of the simulation results, using optimized material parameters, and the measurement results for the stator core stack without teeth. A good approximation by the simulated response characteristics can be observed. Hence, the identified material parameters for a linear and homogeneous material model can represent a dynamical behavior similar to that of the real structure of the stator core without teeth in a frequency range from 0 Hz to 3000 Hz. 

[Figure 13: logarithmic magnitude over frequency from 500 Hz to 3000 Hz. Labeled peaks: modes (2,0,0), (2,1,0), (2,2,0), (2,2,1), (2,2,3), (3,1,0), (3,2,1), (3,0,0), (3,2,3), (3,3,0) and (3,3,4).]

Fig. 13: Comparison of the sum of all measured and calculated FRFs of the stator core stack without teeth, by using a material model with transversally isotropic elasticity and optimized parameters 

The coincident eigenfrequencies and their corresponding 

modes are listed in Table VII. 


TABLE VII: Coincident measured and simulated eigenfrequencies and modes of the stator core stack without teeth, resulting from adjustment 

Mode        Measured eigenfreq.  Simulated eigenfreq.
(2, 0, 0)    769.22 Hz            749.11 Hz
(2, 1, 0)    795.03 Hz            757.16 Hz
(2, 2, 0)   1280.95 Hz           1278.38 Hz
(2, 2, 1)   1353.39 Hz           1362.88 Hz
(2, 2, 3)   1606.08 Hz           1607.62 Hz
(3, 2, 1)   1881.54 Hz           1869.87 Hz
(3, 1, 0)   2092.07 Hz           2095.87 Hz
(3, 0, 0)   2109.96 Hz           2097.67 Hz
(3, 2, 3)   2278.98 Hz           2313.00 Hz
(3, 3, 0)   2509.72 Hz           2512.20 Hz
(3, 3, 4)   2826.32 Hz           2830.84 Hz
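The quality of the match in Table VII can be quantified by the relative deviation of each simulated eigenfrequency from the measured one. The following sketch uses the values of Table VII; the deviation measure itself is a choice made here for illustration and is not stated in the paper.

```python
# (mode, measured eigenfrequency in Hz, simulated eigenfrequency in Hz),
# copied from Table VII
TABLE_VII = [
    ((2, 0, 0), 769.22, 749.11),
    ((2, 1, 0), 795.03, 757.16),
    ((2, 2, 0), 1280.95, 1278.38),
    ((2, 2, 1), 1353.39, 1362.88),
    ((2, 2, 3), 1606.08, 1607.62),
    ((3, 2, 1), 1881.54, 1869.87),
    ((3, 1, 0), 2092.07, 2095.87),
    ((3, 0, 0), 2109.96, 2097.67),
    ((3, 2, 3), 2278.98, 2313.00),
    ((3, 3, 0), 2509.72, 2512.20),
    ((3, 3, 4), 2826.32, 2830.84),
]


def relative_deviation_percent(measured, simulated):
    """Signed deviation of the simulated from the measured value in percent."""
    return 100.0 * (simulated - measured) / measured


deviations = {mode: relative_deviation_percent(m, s) for mode, m, s in TABLE_VII}
worst_mode = max(deviations, key=lambda mode: abs(deviations[mode]))
```

With these values, the largest deviation occurs for mode (2,1,0), at about -4.8 %, followed by mode (2,0,0) at about -2.6 %; all other listed modes deviate by less than 1.5 %.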

B. Stator core stack with teeth 

As an initial dataset for the stator core stack with teeth, 

the resulting material parameters of the stator core stack 

without teeth have been chosen. 

For the investigation of the stator core stack with teeth, the density has to be determined. With a mass of 196.4 kg and a volume of 2.61335 · 10^-2 m^3, the density of the stator core stack with teeth results in 7515.3 kg/m^3. This is 0.14 % less than the density of the stator core stack without teeth. Hence, all simulations for the stator core stack with teeth use the newly found density. 
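The two densities and their difference can be reproduced from the stated masses and volumes. The mass and volume of the stack without teeth are the values given in section IV.

```python
def density(mass_kg, volume_m3):
    """Density in kg/m^3 from mass and volume."""
    return mass_kg / volume_m3


rho_without_teeth = density(149.8, 0.019905)   # about 7525.7 kg/m^3
rho_with_teeth = density(196.4, 2.61335e-2)    # about 7515.3 kg/m^3
difference_percent = (100.0 * (rho_without_teeth - rho_with_teeth)
                      / rho_without_teeth)     # about 0.14 %
```

The value of 7525 kg/m^3 used for the stack without teeth corresponds to the computed density without its decimal places, and the difference of 0.14 % follows from the unrounded densities.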

1) Comparison of measured and calculated frequency 

response functions by using the initial dataset: The identified 

material parameters are validated by a comparison 

of the measured and the calculated response characteristics 

of the stator core stack with teeth, see Fig. 14. The 

comparison shows that the match of the measured data 

with the simulation results using the material parameters 

in Table VI for the stator core stack without teeth is not 

satisfactory. Only the modes (2,0,0) and (3,0,0) have a 

smaller deviation than the other modes. Therefore, the 

material parameters of the stator core stack with teeth 

are determined again. 


[Figure 14: logarithmic magnitude over frequency from 500 Hz to 3000 Hz. Labeled peaks: modes (2,0,0), (2,1,0), (2,2,x), (3,0,0), (3,1,0), (3,2,x) and (3,3,x).]

Fig. 14: Validation of the identified material parameters by comparing the results of the simulation and the measurement of the stator core stack with teeth

2) Adjustment of measured and simulated frequency 

response function by using transversally isotropic elasticity: 

The procedure of the adjustment of the material 

parameters is the same as for the stator core stack without 

teeth. For the initial dataset, the material parameters 

identified in section V-A are used. The optimization 

yields a set of material parameters listed in Table VIII. 

TABLE VIII: Dataset of material parameters of transversally isotropic elasticity with optimized Ez 

Material parameter   Value
Ex                   199.8 · 10^9 N/m^2
Ez                   20.1 · 10^9 N/m^2
Gxz                  9.9 · 10^9 N/m^2
νxy                  0.3
νxz                  0.3
ϱ                    7515.3 kg/m^3

Fig. 15 shows that eigenfrequencies exist in the simulation which do not occur in the measurement. Furthermore, the linear and homogeneous model cannot match the simulated modes (2,1,0), (3,1,0) and (4,1,0) to the measured eigenfrequencies of the equivalent modes. The measured frequency spacing of about 100 Hz between the modes (2,0,0) and (2,1,0), (3,0,0) and (3,1,0), etc. could not be reproduced. 


[Figure 15: logarithmic magnitude over frequency from 500 Hz to 3000 Hz. Labeled peaks: modes (2,0,0), (2,1,0), (2,2,0), (2,2,x), (2,3,0), (3,0,0), (3,1,0), (3,2,x), (3,3,0), (4,0,0), (4,3,1) and (4,3,2).]

Fig. 15: Comparison of measured and simulated frequency response functions of the stator core stack with teeth in a frequency domain from 500 Hz to 3300 Hz 

Table IX lists modes and their corresponding measured and simulated eigenfrequencies which could be approximately adjusted in a frequency range from 500 Hz to 3300 Hz. 

TABLE IX: Coincident measured and simulated eigenfrequencies and modes of the stator core stack with teeth, resulting from adjustment 

Mode        Measured eigenfreq.  Simulated eigenfreq.
(2, 0, 0)    661.27 Hz            662.79 Hz
(2, 1, 0)    720.08 Hz            611.87 Hz
(2, 2, 0)   1365.85 Hz           1358.87 Hz
(3, 0, 0)   1767.43 Hz           1777.50 Hz
(3, 1, 0)   1854.83 Hz           1782.64 Hz
(4, 3, 1)   2819.37 Hz           2955.25 Hz
(4, 3, 2)   2962.95 Hz           2955.25 Hz
(4, 0, 0)   3107.52 Hz           3134.49 Hz

VI. SUMMARY AND CONCLUSION 

For the investigation of electrical machines, the dynamical behavior of the stator core is of high interest. The mechanical characterization is difficult since the structure of the stator core is inhomogeneous. In this paper, an approach has been presented which yields a linear and homogeneous description of a stator core stack. 

For the investigation of the dynamical behavior, two stator core stacks, one without stator teeth and the other with stator teeth, have been chosen. The experimental modal analysis has been carried out on both stator core stacks. The measurement results have been used for the adjustment of the simulation data. 

The numerical modal analysis has been applied in 

conjunction with the finite element method. For that, 

adequate models had to be chosen. The inhomogeneous 

structure has been represented by a linear and homogeneous 

FEM model and the lamination of the stator cores 

has been considered by a transversally isotropic material 

model. 

A study of the influence of the transversally isotropic 

material parameters has been carried out. Thereby, each 

material parameter has been varied and the resulting 

FRFs have been compared. In this way, it could be identified 

which material parameter influences which mode. This 

knowledge is the basis of the adjustment of the simulated 

response characteristics of the stator cores. 

Before adjusting the response characteristics, a comparison 

of the simulated results using a material model 

of isotropic elasticity with the measurement results has 

been carried out for the stator core stack without teeth. 

It could be shown that a material model with isotropic 

elasticity is not appropriate. 

The stepwise adjustment of the simulated FRFs to 

the measured ones has shown a good match of the response 

characteristic of the stator core stack without teeth up to 

a frequency of 3 kHz. However, some measured modes 

could not be identified with the used linear and homogeneous 

numerical model. The approximated material 

parameters have then been validated by a comparison of 

the measured and simulated dynamical behavior of the 

stator core stack with teeth. This validation shows that 

the response characteristic of the measured and simulated 

results did not coincide except for some simulated 

eigenfrequencies. 

In order to improve the numerical model of the stator 

core stack with teeth, a further optimization of the material 

parameters has been carried out. The match of the 

measured and simulated response characteristics is not 

as good as for the stator core stack without teeth. In the 

investigated frequency range of the measured data some 

eigenfrequencies and eigenmodes could not be identified 

in the simulation. 

However, a working method has been introduced,

which describes the three-dimensional dynamical behavior 

of a stator core stack by using a linear and 

homogeneous numerical model. 

REFERENCES 

[1] B. Weilharter, O. Biro, H. Lang, G. Ofner, and S. Rainer, “Validation 

of a comprehensive analytic noise computation method for 

induction machines,” Industrial Electronics, IEEE Transactions 

on, vol. 59, no. 5, pp. 2248 –2257, may 2012. 

[2] J. Li, X. Song, D. Choi, and Y. Cho, “Research on reduction 

of vibration and acoustic noise in switched reluctance motors,” 

in Advanced Electromechanical Motion Systems Electric Drives 

Joint Symposium, 2009. ELECTROMOTION 2009. 8th International 

Symposium on, july 2009, pp. 1 –6. 

[3] S. Rainer, O. Biro, B. Weilharter, and A. Stermecki, “Weak coupling 

between electromagnetic and structural models for electrical 

machines,” Magnetics, IEEE Transactions on, vol. 46, no. 8, pp. 

2807 –2810, aug. 2010. 

[4] B. Weilharter, O. Biro, S. Rainer, and A. Stermecki, 

“Computation of rotating force waves in skewed induction machines 

using multi-slice models,” Magnetics, IEEE Transactions 

on, vol. 47, no. 5, pp. 1046 –1049, may 2011. 

[5] H. Jordan, Geräuscharme Elektromotoren. Essen: Verlag W. 

Girardet, 1951. 

[6] R. N. Arnold and G. B. Warburton, “Flexural vibrations of the 

walls of thin cylindrical shells having freely supported ends,” Proceedings 

of the Royal Society of London. Series A, Mathematical 

and Physical Sciences, vol. 197, no. 1049, pp. 238–256, Jun. 1949. 

[7] H. Frohne, “Über die primären Bestimmungsgrößen der 

Lautstärke bei Asynchronmotoren,” Ph.D. dissertation, Technische 

Hochschule Hannover, 1959. 

[8] E. C. Naumann and J. L. Sewall, “An experimental and analytical 

vibration study of thin cylindrical shells with and without longitudinal 

stiffeners,” NASA Langley Research Center, Tech. Rep. 

NASA-TN-D-4705, Sep 1968. 

[9] Y. Stavsky and R. Loewy, “On vibrations of heterogenous orthotropic 

cylindrical shells,” Journal of Sound and Vibration, 

vol. 18, no. 3, pp. i–i, 1971. 

[10] P. L. Tímár, A. Fazecas, J. Kiss, A. Miklós, and S. J. Yang, 

Noise and Vibration of Electrical Machines. New York: Elsevier 

Science Publishing Company, Inc., 1989. 

[11] C. Wang and J. C. S. Lai, “Prediction of natural frequencies of 

finite length circular cylindrical shells,” Applied Acoustics, vol. 59, 

no. 4, pp. 385 – 400, 2000. 

[12] K.-J. Bathe, Finite-Elemente-Methoden, 2nd ed. Berlin Heidelberg: 

Springer, 2002, aus dem Englischen übersetzt von Peter 

Zimmermann. 

[13] R. Gasch and K. Knothe, Strukturdynamik: Band 1: Diskrete 

Systeme. Springer, 1987. 

[14] W. Yixuan and W. Ying, “Research on dynamic characteristics 

of stator core of large turbo-generator,” in Power and Energy 

Engineering Conference (APPEEC), 2010 Asia-Pacific, march 

2010, pp. 1 –4. 

[15] H. Wang and K. Williams, “Effects of laminations on the vibrational 

behaviour of electrical machine stators,” Journal of Sound 

and Vibration, vol. 202, no. 5, pp. 703–715, May 1997. 

[16] M. van der Giet and K. Hameyer, “Identification of homogenized 

equivalent materials for the modal analysis of composite structures 

in electrical machines,” in the 9th International Conference on 

Vibrations in Rotating Machinery, VIRM 2008, Exeter, UK, 2008, 

pp. 437–448. 

[17] J. Roivainen, “Unit-wave response-based modeling of electromechanical 

noise and vibration of electromechanical machines,” 

Ph.D. dissertation, Helsinki University of Technology, Helsinki, 

2009. 

[18] S. Verma and A. Balan, “Experimental investigations on the 

stators of electrical machines in relation to vibration and noise 

problems,” Electric Power Applications, IEE Proceedings -, vol. 

145, no. 5, pp. 455 –461, sep. 1998. 

[19] B. Peeters, H. Van der Auweraer, P. Guillaume, and J. Leuridan, 

“The polymax frequency-domain method: a new standard for 

modal parameter estimation?” Shock and Vibration, vol. 11, no. 3, 

pp. 395–409, Jan. 2004. 


[20] D. J. Ewins, Modal Testing: Theory, Practice and Applications, 

2nd ed. Letchworth, Hertfordshire, UK: Research Studies Press, 

2000.


Proper Location of the Regulating Coil in Transformers 

from Short-Circuit Forces Point of View 

O. Sonmez*, B. Duzgun*, G. Komurgoz*

* Istanbul Technical University, Electrical and Electronics Faculty, 80626 Istanbul, Turkey

Sönmez Transformer Company, 41410 Kocaeli, Turkey

e-mail: komurgoz@itu.edu.tr

Abstract—A transformer has a complicated network of internal forces acting on and stressing the conductors, supports and insulation structures. These forces arise from the interaction of current-carrying conductors with the magnetic fields produced by an alternating-current source. The location of the regulating coil in a transformer determines the effect of the electrodynamic forces on the operational behavior of the transformer. This paper presents design principles for the regulating coil in transformers and shows the electrodynamic forces and the resulting deformations obtained with the finite element method.

Index Terms—Electrodynamic Forces, Deformation Analysis, FEA, Regulating Coil.

I. INTRODUCTION

The transformer is a critical and costly component of power generation and transmission systems with regard to reliability and performance. Transformer capacities keep increasing with the rapid development of power systems. As voltage levels rise, the time needed to design a transformer becomes very important. One of the main problems in transformer design is the radial and axial forces, which are proportional to the square of the short-circuit current. The interaction of the leakage field with the short-circuit current generates large short-circuit forces in the windings of large power transformers, and these forces push or pull the windings. The leakage flux not only causes additional losses and forces, but also heats the internal components. The short-circuit current is 8 to 10 times the rated current in larger transformers and 20 to 25 times in smaller units. Forces arising during a short circuit may range from ten thousand to a million newtons. Under such large forces and the thermal expansion of the wires, the insulation of the transformer windings can be distorted or even collapse, a short-circuit fault can occur, or the clamping structures can be damaged. Furthermore, the location of the tappings has the predominant effect on the axial forces since it controls the residual ampere-turns. Failure of transformers due to short circuits is a major concern for power utilities and manufacturers. These hazards can be avoided by designing the winding structure properly against thermal and mechanical strains so as to prevent permanent deformation and movement of the windings, provided that the forces can be calculated correctly.

In the past, many technical papers have been published which give equations for calculating the electromagnetic forces acting on the windings in transformers [1-9]. The electromagnetic force computation methods proposed in the literature are mainly based on static and transient formulations [10]. Classical methods can be used to compute the short-circuit forces in windings [11]. These methods use simplified configurations with some assumptions. They are simple and fast, but they are not accurate and not suitable for predicting the performance of special types of transformers, especially when the axial lengths of the windings are not equal [12]. With modern computerized methods, however, it is possible to calculate the forces acting on the elements of the winding and the effect of any arrangement of parts and of asymmetries. If the magnetic field is calculated accurately, the electromagnetic forces in a detailed transformer model can be determined with numerical methods such as the finite element method (FEM), the finite difference method (FDM) and the boundary element method (BEM). In recent years, significant development of FEM software has made the force calculation easy even where the winding and tapping arrangement is complex.

This paper concentrates on the use of FEM models. This method provides a comprehensive view of the overall mechanical and electromagnetic behavior of the transformer under normal and disturbance conditions. The effect of tap winding configurations is also analyzed. The results obtained from FEM of transformers using MAXWELL® and ANSYS® are validated by the mathematical models.

II. FORCES ACTING ON THE TRANSFORMER 

When the electromagnetic force becomes greater than the strength of the windings, the windings will fail. The electromagnetic forces acting on a transformer can be classified as “radial forces”, which develop in the x direction, and “axial forces”, which develop in the y direction. Both analytical and numerical methods are available for the calculation of these forces, such as the residual ampere-turn method, Robin’s solution, Smythe’s solution, calculation using Fourier series, the two-dimensional method of images, FEM, and the image method with discrete conductors [13].

Axial forces cause slipping or collapse of the windings as a whole, standing-up of parts of the windings, and tilting and deformation of coils. Radial forces cause buckling of the inner windings and excessive elongation of the outer windings.

A. Axial Forces 

One of the simplest methods, the residual ampere-turn method, gives close approximations and reliable results for the calculation of axial forces. Concentric windings are separated into two groups, each of which has balanced ampere-turns. The radial ampere-turns produce a radial flux, which causes axial forces in the windings, as seen in Figure 1. This assumption allows the axial forces to be calculated.

Figure 1: Axial and radial forces in concentric axially nonsymmetrical 

windings [13]. 

The algebraic sum of the ampere-turns of the low-voltage and high-voltage windings between any point and an end of the windings gives the radial ampere-turns at that point in the winding. Plotting this value for every point gives the residual or unbalanced ampere-turn diagram, from which the method derives its name [12]. Windings without axial displacement and with the same length clearly have no residual ampere-turns and no forces between the windings. However, there are still internal compressive forces and forces on the end coils, even though there is no axial thrust between the windings.

Figure 2 gives the methodology for determining the distribution of the radial ampere-turns. Here ‘a’ is the length tapped out at the end of the outer winding. Arrangements I and II shown in Figure 2(b) are both balanced ampere-turn groups. If these groups are superimposed, they produce the given ampere-turn arrangement. The triangle shown in Figure 2(c) is the diagram of the radial ampere-turns, plotted against distance along the winding. Its maximum value is a(NImax), where (NImax) represents the ampere-turns of either the low-voltage or the high-voltage winding.

Figure 2: Determination of residual ampere-turns [12]. 
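The triangular residual ampere-turn diagram can be reproduced numerically. The following is a minimal sketch, not the authors' procedure: it assumes a per-unit winding height, a uniformly distributed inner winding over the full height, and an outer winding with a fraction `a` tapped out at one end. The function name `residual_ampere_turns` and the sign convention are illustrative assumptions.

```python
def residual_ampere_turns(a, ni, n=1000):
    """Residual (radial) ampere-turns along a per-unit winding height.

    a  : fraction of the outer winding tapped out at the top end (0 < a < 1)
    ni : ampere-turns of either winding
    n  : number of sample intervals along the height
    """
    ys = [i / n for i in range(n + 1)]
    residual = []
    for y in ys:
        # Inner winding: ampere-turns spread uniformly over the full height.
        inner = ni * y
        # Outer winding: the same ampere-turns with opposite sign,
        # squeezed into the untapped part 0..(1 - a) of the height.
        outer = -ni * min(y, 1.0 - a) / (1.0 - a)
        residual.append(inner + outer)
    return ys, residual


ys, res = residual_ampere_turns(a=0.15, ni=1000.0)
peak = max(abs(v) for v in res)
print(f"peak residual ampere-turns: {peak:.1f} (expected a*NI = {0.15 * 1000.0:.1f})")
```

Under these assumptions the diagram is zero at both ends and peaks at the boundary of the tapped section with magnitude a·NI, which matches the maximum value a(NImax) quoted for the triangle.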


The location of the tappings on the winding has a great effect on the axial forces since it controls the residual ampere-turn diagram.

B. Radial Forces 

The radial forces develop due to the interaction of the coil currents with the axial component of their own magnetic flux. In a transformer with concentric windings, radial forces are considered insignificant because the radial strength of the winding is high. Most problems occur because of axial forces, and axial movement causes more damage to the winding and insulation than radial movement.

The inner coil is subjected to a pressure that tends to collapse it onto the core. At the same time, the outer coil is under a pressure that tends to expand its diameter, which produces a stress as shown in Figure 1. Circular coils are the preferred choice in a transformer because they are the strongest shape for withstanding the radial pressure mechanically [14].

III. CALCULATION OF ELECTROMAGNETIC FORCES 

A. Short-Circuit Current 

Short-circuit currents in the windings have a significant effect on the calculation of electromagnetic forces. Generally, the short-circuit current is calculated for different situations by considering [15]:

Tapping arrangement 

Fault position 

Short-circuit power combination (network and 

transformer) 

Short-circuit type (e.g. three phase symmetrical) 

To see the effects of the short-circuit current on power transformers, the simplest fault scenario, a three-phase short circuit, is investigated. The symmetrical short-circuit current can be calculated according to IEC 60076-5 as [16]:

$$I = \frac{U}{\sqrt{3}\,(Z_t + Z_s)} \tag{9}$$

and the amplitude I_max of the first peak of the current is:

$$I_{max} = I\,k\sqrt{2} \tag{10}$$

The factor k accounts for the initial offset of the current and \sqrt{2} is the peak-to-r.m.s. ratio of a sinusoidal wave. The factor k\sqrt{2} depends on the X/R ratio, and the values of k are given in the standard IEC 60076-5 [16]. This current is based on the following expression for the peak factor:

$$k\sqrt{2} = \left(1 + e^{-(\phi + \pi/2)\,R/X}\,\sin\phi\right)\sqrt{2} \tag{11}$$

where \phi is the phase angle.
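As a worked illustration of (9) to (11), the sketch below evaluates the symmetrical short-circuit current and the peak factor. Several inputs are assumptions rather than values from the paper: the system impedance Z_s is set to zero (infinite bus), the transformer impedance is derived from the 9 % short-circuit voltage on the 11 kV side, the ratio X/R = 14 is chosen only as an example, and the phase angle is taken as arctan(X/R). The function names are hypothetical.

```python
import math


def symmetrical_sc_current(u, z_t, z_s=0.0):
    """Eq. (9): r.m.s. symmetrical short-circuit current in amperes."""
    return u / (math.sqrt(3) * (z_t + z_s))


def peak_factor(x_over_r):
    """Eq. (11): peak factor k*sqrt(2), assuming phi = arctan(X/R)."""
    phi = math.atan(x_over_r)
    r_over_x = 1.0 / x_over_r
    return (1.0 + math.exp(-(phi + math.pi / 2) * r_over_x) * math.sin(phi)) * math.sqrt(2)


# Example data: rated values of the studied unit, low-voltage side.
u = 11e3          # rated voltage in V
s = 25e6          # rated power in VA
u_k = 0.09        # short-circuit voltage (per unit)

z_t = u_k * u**2 / s                  # transformer short-circuit impedance in ohms
i_sc = symmetrical_sc_current(u, z_t)  # Z_s = 0 is an assumption (infinite bus)
k_sqrt2 = peak_factor(14.0)            # X/R = 14 is an assumed example value
i_max = i_sc * k_sqrt2                 # Eq. (10): first peak of the current

print(f"Z_t = {z_t:.4f} ohm, I = {i_sc / 1e3:.2f} kA, "
      f"k*sqrt(2) = {k_sqrt2:.3f}, I_max = {i_max / 1e3:.2f} kA")
```

Because the unidirectional component is included here through the peak factor, the resulting first peak is not directly comparable with currents from which that component has been removed.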

[Figure 3 plot: input currents InputCurrent(Winding_LV_A), InputCurrent(Winding_LV_B) and InputCurrent(Winding_LV_C) in kA (axis from -25 to 25) versus time in ms (axis from 75 to 135), Maxwell transient setup. Marked short-circuit peaks: phase B 21.7270 kA at 105.0 ms, phase A 21.6972 kA at 118.5 ms, phase C 21.6972 kA at 131.5 ms.]

Figure 3: Input currents of low voltage windings. 

The given short-circuit current has two components: a steady-state component and an exponentially decaying unidirectional component. Figure 3 shows the steady-state and short-circuit currents applied to the windings of the power transformer in the Maxwell software. The exponentially decaying unidirectional component is ignored to simplify the calculations.

B. Electromagnetic Forces 

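Transient analysis allows the electromagnetic forces to be calculated at every time step by computing the leakage flux and the full field in the winding region. The fully coupled dynamic physics solution is:

$$\nabla \times (\nu \nabla \times A) = J_s - \sigma \frac{\partial A}{\partial t} - \sigma \nabla V + \nabla \times H_c + \sigma\, v \times (\nabla \times A) \tag{12}$$

The differential equation and the boundary conditions of the transient axisymmetric electromagnetic field can be expressed in cylindrical coordinates as:

$$\Omega:\ \frac{\partial}{\partial z}\left(\nu' \frac{\partial (rA)}{\partial z}\right) + \frac{\partial}{\partial r}\left(\nu' \frac{\partial (rA)}{\partial r}\right) = -J_s + \sigma' \frac{\partial (rA)}{\partial t} \tag{13}$$

$$S_1:\ rA = rA_0 \tag{14}$$

$$S_2:\ \nu' \frac{\partial (rA)}{\partial n} = -H_t \tag{15}$$

For 2D analysis, the radial and axial components of the magnetic flux density can be expressed as:

$$B_r = -\frac{\partial A}{\partial z}, \qquad B_\theta = 0 \tag{16}$$

$$B_z = \frac{1}{r} \frac{\partial (rA)}{\partial r} \tag{17}$$

When the magnetic flux density is decomposed into its radial and axial components, the force is:

$$F = \int_\Omega J_\theta \hat{\theta} \times (B_r \hat{r} + B_z \hat{z})\, d\Omega = F_r \hat{r} + F_z \hat{z} \tag{18}$$

In brief, the force on the power transformer is expressed by the Lorentz force as

$$dF = i\, dl \times B \tag{19}$$

The radial force per unit length is

$$F_x = B_y I_{max}\, dl \tag{20}$$

and the axial force per unit length is

$$F_y = B_x I_{max}\, dl \tag{21}$$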
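The Lorentz force on a single coil can be sketched as follows. This is an illustration under stated assumptions, not the field solver's post-processing: the current density and the flux density are taken as uniform over the coil cross-section, components are ordered (r, θ, z) in a right-handed cylindrical basis, and the names `cross` and `coil_force` are hypothetical. The cross product fixes the signs: an axial flux density gives a radial force, and a radial flux density gives an axial force of opposite sign.

```python
import math


def cross(a, b):
    """Cross product of two 3-vectors given in a right-handed orthonormal basis."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def coil_force(j_theta, b_r, b_z, r_mean, area):
    """Radial and axial force (N) on one ring-shaped coil.

    j_theta : azimuthal current density in A/m^2 (assumed uniform)
    b_r, b_z: radial and axial flux density in T (assumed uniform)
    r_mean  : mean coil radius in m
    area    : coil cross-section in m^2
    """
    # Force density J x B with components ordered (r, theta, z).
    f_r, _f_theta, f_z = cross((0.0, j_theta, 0.0), (b_r, 0.0, b_z))
    volume = 2.0 * math.pi * r_mean * area  # volume of the ring
    return f_r * volume, f_z * volume


fr, fz = coil_force(j_theta=2e6, b_r=0.1, b_z=1.0, r_mean=0.5, area=0.01)
print(f"radial force: {fr:.1f} N, axial force: {fz:.1f} N")
```

In a finite element model the same product would be evaluated element by element and summed, since neither field component is uniform over a real coil.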

IV. RESULTS & DISCUSSIONS 

A. Model 

Electrical machines require an accurate mathematical model for system simulation and performance evaluation. Detailed knowledge of the flux distribution in a transformer plays a very important role in a safe estimation of the forces in the transformer. Complex computer programs are required to obtain a reasonable representation of the field in different parts of the windings. Using the above models to determine the forces, a numerical (FEM) model has been implemented for a 25 MVA power transformer. A 3-D model of the general structure is shown in Figure 4. To reduce computing time and avoid excessive use of RAM, the insulating materials and supporting structure are neglected, and the analyses were carried out on a 2-D model.

Figure 4: 3-D model of analyzed power transformer. 

The characteristics of the studied transformer are 

presented in Table I and geometry details of the analyzed 

transformer are shown in Figure 5. 

TABLE I

TRANSFORMER DATA

Rated power                            25 MVA
Rated frequency                        50 Hz
Rated voltages                         120 / 11 kV
Rated currents                         120 / 1310 A
Turns ratio                            1000 / 159
Connection                             Yd11
Tap setting                            ± 15 %
Transformer short-circuit voltage      9 %

Figure 5: Geometry details of analyzed transformer.

Figure 6: 2-D model of analyzed transformer tapped at upper side. 

B. Electromagnetic Results 

Transformers require an accurate mathematical model for system simulation and performance evaluation. In this study, the magnetic analysis of the designed transformer was carried out with the Maxwell 2D program and the total deformations were investigated with the ANSYS® program (Figure 6). The simulations were completed in the following steps:

1) Creation of the geometric model,

2) Assignment of the materials that make up the structure of the machine,

3) Boundary conditions and meshing,

4) Assignment of the currents in the windings,

5) Analysis,

6) Examination of the results.

Figures 7 and 8 show the leakage flux distributions for the +15% and -15% tapping positions. As the leakage flux increases, the electromagnetic forces increase rapidly.

Figure 7: Leakage flux distribution at +15% tapping position of HV 

windings. 

Figure 8: Leakage flux distribution at -15% tapping position of HV 

windings. 


The distributions of the radial magnetic flux density along the transformer window are shown in Figures 9 and 10 for +15% and -15% tapping at the upper part of the HV windings, respectively.

[Figure 9 plot: "Axial Flux Density Distribution", Mag_B in tesla (axis from 0.00 to 1.50) versus distance in meters (axis from 0.00 to 1.75), at time 115 ms.]

Figure 9: Axial flux distribution at +15% tapping position of HV 

windings. 

Figure 9 shows the axial flux distribution with respect to the height of the winding at the +15% tapping position of the HV windings. To determine the axial forces, it is necessary to find the radial flux produced by the radial ampere-turns. As seen from the figure, the axial flux density is approximately constant along the winding because the windings are symmetrical (with fully balanced ampere-turns).

[Figure 10 plot: "Axial Flux Density Distribution", Mag_B in tesla (axis from 0.00 to 3.50) versus distance in meters (axis from 0.00 to 1.75), at time 115 ms.]

Figure 10: Axial flux distribution at -15% tapping position of HV windings.

If there is an asymmetry in the winding heights due to the tap position or for some other reason such as a failure, the flux distribution changes as shown in Figure 10. The flux density distribution reaches a maximum at one location along the height of the winding.

The electromagnetic forces in the windings of the power transformer are calculated from the leakage flux and the transient currents. The radial and axial forces on each conductor coil of the HV windings are given in Figures 11-14. Figures 11 and 12 show the radial and axial forces at the +15% tapping position of the HV windings.

[Figure 11 plot: "Radial Forces on the HV Coils for +15% tapping at 118.5 ms", radial forces (axis from 0 to 3.5 x 10^5) versus coil numbers 1 to 15.]

Figure 11: Radial Forces at +15% tapping position of HV windings.

[Figure 12 plot: "Axial Forces on the HV Coils for +15% tapping at 118.5 ms", axial forces (axis from -8 x 10^4 to 8 x 10^4) versus coil numbers 1 to 15.]

Figure 12: Axial Forces at +15% tapping position of HV windings. 

The axial and radial forces in the windings when the windings are axially non-symmetrical are given in Figures 13 and 14.

In the symmetrical winding arrangement, the regular distribution of the flux gives smaller force values than in the asymmetrical winding arrangement. If there is an asymmetry in the winding heights due to the tap position (or for some other reason), the ampere-turn unbalance increases and gives rise to forces that tend to break the winding.

[Figure 13 plot: "Radial Forces on the HV Coils for -15% tapping at 118.5 ms", radial forces (axis from 0 to 6 x 10^5) versus coil numbers 1 to 12.]

Figure 13: Radial Forces at -15% tapping position of HV windings.

[Figure 14 plot: "Axial Forces on the HV Coils for -15% tapping at 118.5 ms", axial forces (axis from -9 x 10^5 to 1 x 10^5) versus coil numbers 1 to 12.]

Figure 14: Axial Forces at -15% tapping position of HV windings. 

The total body force density and the total deformation are determined with the ANSYS program and shown in Figures 15-18. Figures 15 and 17 show the effect of the forces on the winding at the +15% tapping position of the HV windings.

Figure 15: Total body force density at +15% tapping position of HV windings (115 ms)

Figure 16: Total body force density at -15% tapping position of HV 

windings (115 ms) 

The deformations in the windings when the windings are axially non-symmetrical are given in Figures 16 and 18. The total deformation depends on the tap position: at the +15% tapping position of the HV windings the deformations are bigger than at the -15% tapping position. The location of the forces shifts to the upper side of the winding.

Figure 17: Total deformations at +15% tapping position of HV windings (115 ms).

Figure 18: Total deformations at -15% tapping position of HV windings (115 ms).

V. CONCLUSION

In this paper, the leakage magnetic field and the electrodynamic forces of a 25 MVA power transformer were analyzed under a short-circuit condition of the low-voltage windings by using the FEM-based programs ANSYS® and MAXWELL®. Two different conditions with the power transformer at maximum tap were analyzed, with the location of the regulating coil changed between them. Afterwards, the deformation results were obtained from the calculated force values. Undesirable stress values in transformers can be prevented by appropriate coil arrangements. The insertion of tap sections in the windings, which produces asymmetries between the LV and HV windings, tends to increase the radial and axial forces and thus to cause damage in transformers. The method of calculation offers a reference for the design of transformers.

LIST OF PRINCIPAL SYMBOLS

a            fractional difference in winding heights
A            magnetic vector potential
Br, Bθ, Bz   components of the flux density (in tesla)
dl           unit length of wire
F            force
h            axial height of the winding
Hc           coercive magnetic field strength of the PM
Ht           tangential component of magnetic intensity
Imax         maximum current
Js           current source density
Jθ           θ-directional short-circuit current density
r̂, θ̂ and ẑ   unit vectors in cylindrical coordinates
S1           parallel boundary condition
S2           vertical boundary condition
U            rated voltage
v            velocity
V            electric scalar potential
Zs           short-circuit impedance of the system
Zt           short-circuit impedance of the transformer
φ            phase angle
σ            conductivity
Ω            studied domain
ν'           reluctivity
σ'           conductance

REFERENCES 

[1] K. Karsai, D. Kerenyi, and L. Kiss, "Large Power Transformers," 

Elsevier Science, December 1987. 

[2] L. Jiao, "The calculation of ampere force on electric power 

transformer under the short circuit situation," Electrical Machines 


And Systems, 2008. ICEMS 2008. International Conference On, 

17-20 October 2008. 

[3] V. Behjat, A. Vahedi, A. Setayeshmehr, et al., "Identification 

of the most sensitive frequency response measurement technique 

for diagnosis of interturn faults in power transformers," 

Measurement Science And Technology, volume 21, No. 7, 2010. 

[4] L. Jiao, B. Bai, and H. Li, "The calculation of ampere force on 

electric power transformer under the short circuit situation," 

International conference on electrical machines and systems, pp. 

4423-4426, October 2008. 

[5] A. C. de Azevedo, I. Rezende, and A. C. Delaiba, "Investigation of 

transformer electromagnetic forces caused by external faults using 

FEM," Transmission & distribution conference and exposition: 

Latin America, August 2006. 

[6] K. Kurita, T. Kuriyama, K. Hiraishi, et al., "Mechanical 

Strength of Transformer Windings under Short-Circuit 

Conditions," IEEE Transactions on Power Apparatus and Systems, 

volume 88, No. 3, March 1969. 

[7] K. Hiraishi, Y. Hori and S. Shida, "Mechanical Strength Of 

Transformer Windings under Short-Circuit Conditions," IEEE 

Transactions on Power Apparatus and Systems, pages 2381-2390, 

October 1971. 

[8] R. E. Ayres, G. O. Usry, M. R. Patel, et al., "Dynamic 

Measurement during Short Circuit Testing of Transformers, Part 

2- Test Results And Analysis," IEEE Transactions on Power 

Apparatus and Systems, volume PAS-94, pages 198-206, 

March/April 1975. 

[9] M. Heathcote and D.P. Franklin, "The J&P Transformer Book" 

Reed Educational and Professional Publishing Ltd, England, 12th 

edition, 1998. 

[10] G. B. Kumbhar and S. V. Kulkarni, "Analysis of short-circuit performance of split-winding transformer using coupled field-circuit approach," Power Delivery, IEEE Transactions On, Issue: 2, pages 936-943, April 2007. 

[11] M. Waters, "The Measurement and Calculation of Axial 

Electromagnetic Forces in Concentric Transformer Windings," 

Proceedings of the IEE - Part II: Power Engineering, volume 101, 

pages 35-46, February 1954. 

[12] M. F. Beavers and C. M. Adams, "The Calculations and 

Measurement of Axial Electromagnetic Forces on Concentric 

Coils in Transformers," Power Apparatus and Systems, Part III. 

Transactions of the American Institute of Electrical Engineers, 

volume 78, pages 467-477, August 1959. 

[13] M. S. A. Minhas, "Dynamic Behaviour of Transformer Winding 

under Short-Circuits," Ph.D. Thesis, University of the 

Witwatersrand, Johannesburg, November 2007. 

[14] M. G. Say, "The Performance and Design of Alternating Current 

Machines," Sir Isaac Pitman & Sons Ltd, London, 3rd edition, 

1958. 

[15] N. Mahomed, "Electromagnetic Forces in Transformers under 

Short-Circuit Conditions," Energize Online, pp. 36-40, March 

2011. 

[16] IEC Standard 60076-5: Power Transformers, Part 5: "Ability to withstand short circuit", 2006.


Robust Design of IPM motors using 

Co-Evolutionary Algorithms 

*Min Li, † André S. Ruela, † Frederico G. Guimarães, † Jaime A. Ramírez and *David A. Lowther 

*McGill University, 3480 University, H3A 2K6 Montreal, Canada 

† Federal University of Minas Gerais, Belo Horizonte, MG 31270-010, Brazil 

E-mail: *david.lowther@mcgill.ca, † jramirez@ufmg.br 

Abstract—A robust design formulation is developed considering the minimization of the torque ripples of an interior 

permanent magnet (IPM) machine in the presence of uncertainties in the values of the design variables. This optimization 

problem is first solved using the worst vertex prediction and a deterministic search. In addition, a competitive co-evolutionary 

algorithm is applied to the minimax optimization problem to find a robust solution, in which one population evolves values for 

the design variables and the other one evolves values in the uncertainty set. Through a worst case analysis, the result from the 

co-evolutionary algorithm is proven to have a more robust performance than that of the non-robust optimization if 

manufacturing tolerance is taken into account. The computation time of the co-evolutionary algorithm may be largely 

reduced through the use of parallel computing environments. 

Index Terms—Evolutionary algorithms, IPM motor, Minimax optimization, Robust design. 


I. INTRODUCTION

In recent years, interior permanent magnet (IPM) 

motors have become popular for many applications that 

require variable speed and torque. As an alternative to the 

traditional induction motor, IPM motors have the 

advantages of higher efficiencies and lower noise. The 

design of an IPM motor is a complicated task that 

involves the consideration of many different aspects, 

such as the size and the weight of the machine, the 

desired output torque, the cost of the permanent magnet, 

etc. In this paper, we focus on the robust design of IPM 

machines for which the objective is the reduction of 

vibrations and noise of the device caused by errors in 

manufacturing, in order to improve the quality and to extend the life of the product. 

The idea of robust design was introduced to electrical 

machine design over two decades ago. Robustness is 

often defined in terms of the performance of the device 

being less sensitive to manufacturing errors and 

variations of the operating conditions. Dr. Taguchi, with his statistically based methods, is considered one of the pioneers of engineering robust design, as he developed the foundations of robust design to meet the challenges of producing high-quality products. A Taguchi-based 

optimization method has been applied to the design of 

brushless DC motors in [1], where the signal-to-noise 

ratio was used to estimate the robustness of the product. 

A robust shape optimization was applied by Yoon to the 

design of electromagnetic devices in [2], where the mean 

and the standard deviation of the performance were 

treated as multi-objectives for the design problem. This 

paper also employed a sensitivity based approach to 

compute the approximation of the standard deviation and 

took feasibility robustness into account. Another useful 

formulation of robust design is to apply the worst case 

analysis and to optimize the worst performance of the 

objective function in the presence of uncertainties [3] [4]. 

The robustness measure is integrated into the 

optimization process by using a robust target function 

defined on the uncertainty set of the design variables; and 

the vertices of the uncertainty set were used to predict the 

worst value of the objective function. Several different 

robust design formulations were reviewed and discussed 

in [5] and the authors proposed that the standard 

deviation can be approximated using the difference 

between the worst performance and the nominal 

performance and the computation cost could be largely 

reduced. In a recent development of sensitivity-based robust design optimization, a gradient index (GI) was defined using the sensitivity of the performance function with respect to some critical uncertainty variables [6]. This simple and efficient algorithm was 

illustrated with an example of MEMS devices where 

robustness is crucial for high yield rate but information 

on uncertainties is hard to obtain. This gradient index 

based robust design method was also tested with the 

TEAM workshop problem 22 in [7]. The worst case 

analysis and the robust target function have also been 

applied to topological design problems [8], where a 

robust topological gradient (TG) was used to evaluate the 

robustness for a certain topological design. 

In addition to deterministic optimization approaches, 

the worst case analysis based robust design problems (i.e. 

minimax optimization) can also be solved using genetic 

algorithms and evolutionary algorithms [9] [10]. In 

particular, co-evolutionary algorithms have been used to 

solve constrained optimization problems formulated as 

the minimax optimization problem [11, 12]. In this case, 

one population evolves solutions for the problem while 

the second one evolves the terms for the Lagrange 

penalty function. Co-evolutionary methods have been 

applied to robust design as well [13, 14]. In [13], a co-evolutionary algorithm is used to design a robust 

nonlinear control under uncertainties. In [14], the authors 

have also reviewed a few different formulations of 

competitive co-evolutionary genetic algorithms. 

In this paper, a robust design problem is defined for 

minimizing the torque ripples of an IPM motor while 

considering the uncertainties in the design variables. Two different approaches based on the worst case scenario are considered. The first one employs a deterministic search and uses the computed sensitivity information to

predict the worst performance of the design. The second 

algorithm is based on a competitive co-evolutionary 

strategy. It introduces a competition between two populations: one evolves values for the design variables and the other evolves values for the uncertainty variables. Details of the two approaches are presented in 

the second section of the paper and the results from the 

robust design are tested against a non-robust design in 

section III. In the last section, the performance and the 

limitations of the two robust design approaches are 

discussed. 

II. ROBUST TOPOLOGY OPTIMIZATION 

A. Robust Design formulation 

A practical way to treat the robust design problem is to 

use the worst case scenario. A robust objective function 

can be defined as: 

$$\min_{x} \max_{\xi \in U(x)} f(\xi), \tag{1}$$

where f(x) is the nonrobust objective function and U(x) is an uncertainty set containing all the possible variations of the design variable x,

$$U(x) = \{\xi \in R^n : (1 - \Delta_i)\,x_i \le \xi_i \le (1 + \Delta_i)\,x_i\}. \tag{2}$$

Then the worst performance of the objective function f can be approximated using the value of f evaluated at one of the vertices of U, i.e. the worst vertex:

$$\max_{\xi \in U(x)} f(\xi) \approx f(x_{pred}). \tag{3}$$

Nevertheless, in a constrained optimization problem, if a nominal optimal solution x* is located close to the boundary of the feasible region, which may happen in some cases, some perturbed solutions, due to the variations of the design variables, will no longer be feasible. In such a situation, the robust solution must be placed away from the boundary of the feasible region to make sure that the entire uncertainty set of x* stays in the feasible region. To ensure feasibility robustness, a robust constraint function is defined as:

$$\max_{\xi \in U(x)} g_i(\xi) \le 0, \tag{4}$$

where gi(x) are the original constraints for the problem. 

Finally, a robust design formulation using a robust 

objective function and a robust constraint function is 

given as: 

$$\min_{x} \max_{\xi \in U(x)} f(\xi) \quad \text{s.t.} \quad \max_{\xi \in U(x)} g_i(\xi) \le 0, \quad X_L \le x \le X_U. \tag{5}$$
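The worst vertex idea can be illustrated with a brute-force sketch. It assumes that the uncertainty set is a box in which each design variable may deviate from its nominal value by a relative fraction, and it simply evaluates the objective at all 2^n vertices. This enumeration stands in for the sensitivity-based prediction of the worst vertex, which needs far fewer function evaluations; the function name `worst_vertex` and the test objective are hypothetical.

```python
from itertools import product


def worst_vertex(f, x, delta):
    """Return the vertex of the uncertainty box with the largest f, and that value.

    f     : objective function taking a tuple of design variables
    x     : nominal design variables
    delta : relative variation allowed for each variable (assumed box-shaped set)
    """
    worst_val, worst_v = None, None
    # Each vertex combines the lower or upper bound of every variable.
    for signs in product((-1.0, 1.0), repeat=len(x)):
        vertex = tuple(xi * (1.0 + s * d) for xi, s, d in zip(x, signs, delta))
        val = f(vertex)
        if worst_val is None or val > worst_val:
            worst_val, worst_v = val, vertex
    return worst_v, worst_val


def objective(v):
    """Hypothetical smooth objective used only for illustration."""
    return v[0] ** 2 + v[1] ** 2


vertex, value = worst_vertex(objective, x=(1.0, 2.0), delta=(0.1, 0.1))
print(f"worst vertex: {vertex}, worst value: {value:.4f}")
```

The same routine applied to each constraint function gives the left-hand side of the robust constraint. The approximation is only exact when the maximum over the box is attained at a vertex, for example when the function is monotonic in each variable over the box.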

B. Topological gradient 

Applications of topological shape optimization to the 

design of electromagnetic devices are relatively new [15] 

[16]. Unlike classical shape optimization, in which only the size and boundary of the design object are allowed to vary, in a topological shape design process, 

the topology of the domain can change as well, for 

instance, by drilling an air hole in the domain or filling 

this hole with a different material than the rest of the 

domain. 

The topological gradient (TG) is defined as the 

derivative of an objective function with respect to an 

infinitely small hole Q as: 

$$TG(x) = \lim_{r \to 0} \frac{\psi(\Omega \setminus Q(x, r)) - \psi(\Omega)}{\delta(\Omega)}, \qquad (6)$$

where ψ is an arbitrary objective function, x is the center of the hole Q, r is the radius of Q, Ω \ Q(x, r) is the new topology after the small hole is present and δ(Ω) is the volume change of the domain Ω, which is the volume of Q but with a negative sign. Thus a positive value of TG(x) means a negative change of the values of the objective function after the small hole Q is created.

Hence the topological gradient can provide information 

on whether a topology change (creating a small hole in 

the system) will result in a decrease of the objective 

function. 

Now we can define a robust objective function based 

on the worst performance of a non-robust function J due 

to the perturbation of the design variable x as: 

$$f_w = \max_{\xi \in U(x)} J(\xi), \qquad (7)$$

where U(x) is the uncertainty set, similar to (2). In a topological design using TG, the design variables are the three-dimensional coordinates, x, of the center of the potential topology change. Hence n = 3 and the vector Δ = {Δ1, Δ2, Δ3} represents the largest variation of each of the three coordinates from its nominal value.

This robust objective function fw can be easily estimated 

using the worst vertex of the rectangular uncertainty set 

U. 

If a topology change is taking place in the design domain Ω, the scalar objective function J can be approximated near the point ξ using a first order local expansion, as:

$$J(\xi) = \psi(\Omega \setminus Q(\xi, r)) = \psi(\Omega) + TG(\xi)\,\delta(r) + o(r^{2}). \qquad (8)$$

Since ψ(Ω) and δ(r) are both constants with respect to ξ (note that δ(r) is the volume of Q with a negative sign), J(ξ) has the largest value where TG(ξ) is the smallest. Therefore, the worst performance of J due to the perturbation of the design variables x is determined by the point ξ in the uncertainty set, where TG(ξ) has the smallest value. Hence we can obtain a robust topological gradient as:

$$TG_R(x) = \min_{\xi \in U(x)} TG(\xi). \qquad (9)$$

Figure 1 is used to illustrate the robustness of a 

topology. There exist two areas for a potential

topological change in the design domain Ω. However, the

first area, which has the highest TG value, is close to a 

large area which has the lowest TG value, i.e. the TG 

value drops drastically in the neighborhood of the first 

area. Therefore the second area, which has the second 

largest TG values, is superior to the first one for a 

topological change in terms of the topological robustness. 

This is, in fact, equivalent to the robust topological 

design using second-order sensitivity analysis. 

Figure 1. Robustness of topology 
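The effect illustrated in Figure 1 can be reproduced with a small sketch of (9) on a sampled TG map. It assumes, purely for illustration, that TG is available on a regular grid and that the uncertainty set covers a fixed window of neighbouring cells; the grid values are invented.

```python
def robust_tg(tg, radius=1):
    """TG_R per cell: minimum of TG over the surrounding window (clipped at the edges)."""
    rows, cols = len(tg), len(tg[0])
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            window = [tg[a][b]
                      for a in range(max(0, i - radius), min(rows, i + radius + 1))
                      for b in range(max(0, j - radius), min(cols, j + radius + 1))]
            out[i][j] = min(window)
    return out

def argmax(grid):
    """Index (row, col) of the largest entry; the first one wins on ties."""
    cells = [(i, j) for i in range(len(grid)) for j in range(len(grid[0]))]
    return max(cells, key=lambda ij: grid[ij[0]][ij[1]])

# Invented TG map: a sharp peak (5.0) next to strongly negative values,
# and a broad plateau of moderately high values (3.0 to 3.5).
tg = [
    [5.0, -4.0, 0.0, 0.0, 0.0, 0.0],
    [-4.0, -4.0, 0.0, 3.0, 3.0, 3.0],
    [0.0, 0.0, 0.0, 3.0, 3.5, 3.0],
    [0.0, 0.0, 0.0, 3.0, 3.0, 3.0],
]
tg_r = robust_tg(tg)
print(argmax(tg), argmax(tg_r))
```

The nominal maximum sits on the isolated peak, while the robust maximum moves to the plateau, which is the behaviour the figure describes for the second area.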

C. Worst vertex prediction using sensitivity 

In robust topology optimization, first, we use the 

robust TG to determine the topology change in the 

problem domain. After we “drilled” a hole in the system, 

the boundary of the hole is parameterized and is 

optimized using a shape optimizer. Thus a new 

uncertainty set is defined for the new design variables, 

which are the coordinates of the controlling points on the 

boundary of the hole. However, the robust objective 

function remains the same through the entire design 

process. 

In order to find the worst vertex, we can use the 

information of the gradient computed at the point x. The 

following figure gives an example of the worst vertex 

prediction in R^2, where the opposite direction of the 

gradient of the objective function points to the worst 

vertex of the uncertainty set. 


Figure 2. Worst vertex prediction using gradient 
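A gradient-based vertex prediction of this kind might be sketched as follows. The box is the relative-tolerance set of (2); the function name and the `follow_gradient` switch are assumptions added for illustration. The default follows the convention stated above (step against the gradient); whether that vertex or the one along the gradient is the worst depends on whether small or large values of the quantity are harmful.

```python
def predict_vertex(x, grad, tol, follow_gradient=False):
    """Pick one vertex of U(x) from the sign pattern of the gradient at x.

    follow_gradient=False steps against the gradient in every coordinate,
    follow_gradient=True steps along it.
    """
    direction = 1.0 if follow_gradient else -1.0
    vertex = []
    for xi, gi, ti in zip(x, grad, tol):
        half_width = abs(ti * xi)          # half edge length of the box in this coordinate
        if gi > 0:
            step = direction * half_width
        elif gi < 0:
            step = -direction * half_width
        else:
            step = 0.0                     # flat direction: stay at the nominal value
        vertex.append(xi + step)
    return vertex

# Hypothetical nominal point, gradient and 5 % tolerances.
x = (10.0, 20.0)
grad = (2.0, -1.0)
tol = (0.05, 0.05)
print(predict_vertex(x, grad, tol))
print(predict_vertex(x, grad, tol, follow_gradient=True))
```

Only one gradient evaluation is needed, compared with 2^n function evaluations for a full vertex search.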

D. Algorithm 

Finally, an algorithm for robust topology optimization 

based on topological shape optimization is described as 

follows: 

1. Set the iteration number k =0. 

2. Calculate the robust topological gradient TGR at the 

center of each element. 

3. Define the new domain Ω_k where the topology 

changes take place by removing the material in the 

elements where TGR is greater than zero. 

4. Apply standard shape optimization method with a 

robust objective function to determine the shape of the 

boundary. 

5. Check convergence and exit if the optimality 

condition is satisfied. 

6. Set k=k+1 and go to 2. 

Note that uncertainties related to both topology and 

shape are being handled throughout the entire design 

process. 
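The six steps can be arranged as a simple driver loop. This is only a structural sketch: `robust_tg`, `optimize_shape` and `converged` are hypothetical callables standing in for the FEM-based computations, elements are reduced to a dictionary of material flags, and `max_iter` is a safeguard that is not part of the listed steps.

```python
def robust_topology_optimization(elements, robust_tg, optimize_shape, converged, max_iter=20):
    """elements maps an element id to True (material present) or False (removed)."""
    k = 0                                                   # step 1
    while k < max_iter:
        # Step 2: robust topological gradient at the center of each solid element.
        tgr = {e: robust_tg(e, elements) for e, solid in elements.items() if solid}
        # Step 3: remove material where TG_R is greater than zero.
        for e, value in tgr.items():
            if value > 0:
                elements[e] = False
        # Step 4: shape optimization of the new boundary with the robust objective.
        optimize_shape(elements)
        # Step 5: stop when the optimality condition is satisfied.
        if converged(elements, tgr):
            break
        k += 1                                              # step 6
    return elements, k

# Stub problem: fixed TG_R values, no-op shape step, stop when no TG_R is positive.
tg_table = {0: 1.0, 1: -0.5, 2: 0.2, 3: -1.0}
result, iterations = robust_topology_optimization(
    {e: True for e in tg_table},
    robust_tg=lambda e, els: tg_table[e],
    optimize_shape=lambda els: None,
    converged=lambda els, tgr: all(v <= 0 for v in tgr.values()),
)
print(result, iterations)
```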

III. COMPETITIVE CO-EVOLUTIONARY ALGORITHMS 

Co-evolutionary algorithms are suitable for solving 

minimax optimization problems. The robust design 

formulation based on the worst case analysis defined in 

(1) can be generalized as 

$$\min_{x \in X} \max_{u \in U} f(x, u), \qquad (10)$$

where f(·,·) is an objective or fitness function, x is a 

vector of the design variables and u is a vector of the 

uncertainty variables. Equation (1) is a special case of 

(10) where ξ = x + u. This formulation presents a 

competitive relationship between the two players, where 

the leader selects a value in X and the follower chooses a 

value in U in correspondence. 

In a worst case analysis based robust design, the 

system seeks for the best design under its worst case 

scenario. Thus it is possible to decompose this design 

process into two tasks, to find the design with the best 

performance and to find the worst performance of a

design subject to small variations of the design 

parameters. Similarly, in a competitive co-evolutionary 

algorithm, one population competes with the other 

leading to an “arms race”. The first population (denoted 

as population A in the rest of the paper) represents 

candidate solutions in the design space, which are 

evolving to minimize the objective; while the second 

population (population B) represents disturbances in U, 

an uncertainty set applied to the design variables in order 

to maximize the objective function. In other words, the 

former population provides a solution and the second 

population tries to attack the solution in the worst case 

scenario. Through the evolutions of the two populations, 

a more robust solution, i.e. with the best possible worst 

case performance, can be found after several generations. 

Therefore, this competitive model will guide the 

evolution towards robust solutions. 

A. Fitness computation 

In co-evolutionary algorithms, the two populations, A 

and B, evolve independently, but the fitness evaluations 

of the populations are related to each other. The fitness of 

an individual in one population is evaluated against 

values of the individuals from the other population. For 

instance, the fitness of an individual in population A is 

defined as, 

$$F(x) = f(x, u^{*}), \qquad (11)$$

where u* is the current best solution from population B. 

The goal of evolution of population A is to minimize the 

fitness function F(x). 

The fitness of an individual in population B is assigned 

against the value of the current best individual x* in 

population A. Therefore the fitness function for 

population B is defined as, 

$$G(u) = f(x^{*}, u). \qquad (12)$$

The goal of evolution of population B is to maximize the 

fitness function G(u). 

B. Alternating co-evolutionary GA 

One typical co-evolutionary approach that can be 

applied to the continuous minimax problem is called the 

alternating co-evolutionary GA (ACGA). Figure 3 shows 

a diagram of the ACGA. The two populations (A and B) 

are initialized randomly. The fitnesses of the individuals 

in one population are evaluated against the other 

population using the functions defined in (11) and (12). 

For instance, after the initialization of population A (i.e. a 

set of random values is assigned to the design variables x 

between the lower bound and the upper bound), the 

algorithm fixes the values of the uncertainty variables u 

and evolves population A for several generations to 

minimize the fitness function F. Then the algorithm 

switches to the evolution of population B, while the 

values of the design variables archived for the best fitness 

are kept and the values of the uncertainty variables are 


updated towards the maximization of the fitness function 

G. This alternating process repeats until the stopping 

criterion is met, e.g. the maximum number of 

generations. 

Figure 3 Alternating Co-Evolutionary GA 

In the algorithm implemented in this paper, both 

populations have a total of 100 individuals each. At 

the beginning of the execution, these individuals are 

randomly generated, respecting the bounds. The 

candidate solutions are represented by a one-dimensional 

array of real values. An individual has three genes 

representing the values of the design variables or the 

uncertainty variables. Individuals are selected for 

reproduction by means of a binary tournament where two 

individuals are randomly selected and their fitness values 

are compared, and the individual with the better fitness is 

selected for reproduction. 

The crossover operator used in the algorithm is a 

combination of an extrapolation method with a one-point 

crossover method [17]. Each pair of the selected 

individuals undergoes crossover with a recombination 

rate r = 1.0, and produces two offspring. The operator 

performs a blend crossover of the gene at the crossing 

point, with a random factor within the interval [0, 1]. 

After crossover, a mutation operator is applied to the 

offspring, with a mutation rate m = 0.2. The mutation 

operator is very simple. If a gene is under mutation, the 

algorithm randomly generates a new real value within the 

bounds. The genetic algorithm implemented is 

generational, i.e. all offspring replace their parents in the 

next generation. 

All the offspring are then evaluated and the best 

individual is stored and is used as a population 

representative and passed as argument for the opponent’s 

evaluation, as described in equations (11) and (12).

The algorithm runs for a maximum of 100 generations 

and returns the best stored pair (x, u). 
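The operators and the alternation described above might be sketched as follows. This is a minimal illustration under several assumptions: the toy objective `f_toy` replaces the FEM model, the crossover follows the extrapolation plus one-point scheme as it is commonly stated for [17], and the names `cycles` and `gens_per_turn` (how often the populations alternate and for how long each evolves) are introduced here because the text does not give those values.

```python
import random

def tournament(pop, fit, rng, minimize):
    """Binary tournament: draw two individuals, keep the one with the better fitness."""
    a, b = rng.randrange(len(pop)), rng.randrange(len(pop))
    a_wins = fit[a] <= fit[b] if minimize else fit[a] >= fit[b]
    return pop[a] if a_wins else pop[b]

def crossover(p1, p2, rng):
    """Blend the gene at a random crossing point, swap the genes to its right."""
    k, beta = rng.randrange(len(p1)), rng.random()
    g1 = p1[k] - beta * (p1[k] - p2[k])
    g2 = p2[k] + beta * (p1[k] - p2[k])
    return p1[:k] + [g1] + p2[k + 1:], p2[:k] + [g2] + p1[k + 1:]

def mutate(ind, bounds, rate, rng):
    """Replace a mutated gene by a new random value within its bounds."""
    return [rng.uniform(lo, hi) if rng.random() < rate else g
            for g, (lo, hi) in zip(ind, bounds)]

def next_generation(pop, fitness, bounds, rng, minimize, rate=0.2):
    """Generational replacement: all offspring replace their parents."""
    fit = [fitness(ind) for ind in pop]
    children = []
    while len(children) < len(pop):
        c1, c2 = crossover(tournament(pop, fit, rng, minimize),
                           tournament(pop, fit, rng, minimize), rng)
        children += [mutate(c1, bounds, rate, rng), mutate(c2, bounds, rate, rng)]
    return children[:len(pop)]

def acga(f, x_bounds, u_bounds, size=100, cycles=10, gens_per_turn=5, seed=0):
    rng = random.Random(seed)
    def init(bounds):
        return [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(size)]
    pop_a, pop_b = init(x_bounds), init(u_bounds)
    x_star, u_star = pop_a[0], pop_b[0]
    for _ in range(cycles):
        F = lambda x: f(x, u_star)            # eq. (11): population A minimizes F
        for _ in range(gens_per_turn):
            pop_a = next_generation(pop_a, F, x_bounds, rng, minimize=True)
        x_star = min(pop_a, key=F)
        G = lambda u: f(x_star, u)            # eq. (12): population B maximizes G
        for _ in range(gens_per_turn):
            pop_b = next_generation(pop_b, G, u_bounds, rng, minimize=False)
        u_star = max(pop_b, key=G)
    return x_star, u_star

# Toy objective with xi = x + u, three genes per individual as in the text.
f_toy = lambda x, u: sum((xi + ui - 1.0) ** 2 for xi, ui in zip(x, u))
x_bounds, u_bounds = [(0.0, 2.0)] * 3, [(-0.1, 0.1)] * 3
x_best, u_best = acga(f_toy, x_bounds, u_bounds, size=30, cycles=5, gens_per_turn=3, seed=1)
print(x_best, u_best)
```

Passing an explicit seed makes a run repeatable, which helps when comparing parameter settings on an expensive model.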

C. Parallel co-evolutionary GA 

The parallel co-evolutionary GA (PCGA), shown in 

figure 4, is very similar to the ACGA, except that the two 

competitive populations evolve simultaneously. As a 

parallel model, this can be implemented easily for a 

parallel computing environment and the computational 

time will be reduced to half in theory. 

Figure 4 Parallel Co-Evolutionary GA 

Several other methods of co-evolutionary algorithms 

using different schemes of the fitness assignment can be 

seen in [18]–[20]. In this paper, the alternating co-evolutionary

GA is used to solve the robust design

problem.

IV. RESULTS 

Figure 5 A simulation model of an IPM motor. 


A. Numerical model 

Figure 5 shows a quarter of a 3-phase 4-pole IPM 

machine. The quarter of the rotor core has one slot in the 

center and the rest of the core is made of steel. A 

permanent magnet bar made of NdFeB magnet is inserted 

in the center of the slot. The goal of the design is to find 

the optimal shape of the permanent magnet bar and the 

flux barriers which minimize the torque ripples of this 

motor, while maintaining an adequate average torque. 

The objective function can be defined, without 

considering the manufacturing uncertainties, as, 

$$\min_{x} \max_{u} F(\xi) = \max_{i} \frac{|T_i - T_{avg}|}{T_{avg}}, \qquad \text{s.t.} \quad \min_{u} T_{avg} \ge 0.4\ \mathrm{Nm}. \qquad (13)$$

The design variables chosen for the optimization are the 

length of the permanent magnet, L, the width of the 

permanent magnet, h and the distance from the 

permanent magnet to the surface of the rotor, d. 
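A ripple measure of this kind can be evaluated from torque samples taken at successive rotor positions. The sketch below assumes the ripple is read as the largest deviation from the average torque divided by the average torque; the sample values are invented, and the 0.4 Nm limit is the one stated in the constraint.

```python
def torque_ripple(torques):
    """Return (ripple, average torque) for torque samples at different rotor positions."""
    t_avg = sum(torques) / len(torques)
    ripple = max(abs(t - t_avg) for t in torques) / t_avg
    return ripple, t_avg

def meets_torque_constraint(torques, t_min=0.4):
    """Average torque must stay at or above t_min (in Nm)."""
    return torque_ripple(torques)[1] >= t_min

# Invented torque samples in Nm.
samples = [0.45, 0.50, 0.55, 0.50]
ripple, t_avg = torque_ripple(samples)
print(ripple, t_avg, meets_torque_constraint(samples))
```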

This numerical model is solved using a 2-D nonlinear 

finite element solver (MagNet [21]). At each iteration of 

the optimization, torques are evaluated at different 

positions of the rotor. The rotor mesh is regenerated after 

a new geometry of the rotor is archived during the 

optimization process. 

B. Results from RTO 

The robust topological optimization method is applied 

to a rotor core filled only with iron [22]. The topological 

gradient is evaluated in the design region in order to find 

potential topological changes which reduce the value of 

the cost function. The permanent magnet and air 

materials are created in the region according to the TG 

values, as shown in figure 6. 

Figure 6 Topology of the rotor generated by RTO 

This shows a rough topology with one permanent 

magnet block and two air flux barriers of the design, 

which serves as the starting point of the shape 

optimization process. The system then optimizes the 

shape of the boundaries between different materials in 

order to achieve more accurate values of the geometries. 

The values of the design variables at the robust optimum are: 

h = 1.616 mm, L = 19.879 mm and d = 12.243 mm.

C. Results from ACGA 

It is not practical to combine the topology optimization 

with the alternating co-evolutionary GA due to the huge 

computational cost. Thus the ACGA is applied to the 

model shown in figure 5 to find the robust optimal values 

of the design variables. The manufacturing tolerances of 

the design variables are considered as the uncertainties of 

the problem. The optimal results, from the robust 

formulation, are given in table I. 

TABLE I 

VALUES OF DESIGN VARIABLES OF THE NOMINAL AND ROBUST OPTIMA 

Design variable          Unit   Nominal optimal   Robust optimal (by RTO)   Robust optimal (by ACGA)
h                        mm     1.588             1.616                     1.453
L                        mm     18.33             19.879                    19.345
d                        mm     13.426            12.243                    12.695
Nominal performance      Nm     0.2358            0.2583                    0.2547
Worst performance        Nm     0.2709            0.2791                    0.3039
Feasibility robustness          No                Yes                       Yes

Table I shows the values of the non-robust optimum and the robust optima. The worst cases of the performances are evaluated. The uncertainty is set to be 5% of the design variables. 
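One way to read the performance rows of Table I is to express the gap between nominal and worst performance as a fraction of the nominal value. The short sketch below does this arithmetic on the tabulated numbers; the relative measure itself is an added illustration, not a quantity defined in the paper.

```python
# (nominal performance, worst performance) as listed in Table I
designs = {
    "nominal optimum": (0.2358, 0.2709),
    "robust (RTO)": (0.2583, 0.2791),
    "robust (ACGA)": (0.2547, 0.3039),
}

def relative_degradation(nominal, worst):
    """Increase from nominal to worst performance, relative to the nominal value."""
    return (worst - nominal) / nominal

degradation = {name: relative_degradation(*values) for name, values in designs.items()}
for name, value in degradation.items():
    print(f"{name}: {100.0 * value:.1f} %")
```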

V. CONCLUSION 

This paper discusses robust design issues and 

formulations for IPM design problems. Two different 

methods have been applied, and they can both find robust 

solutions for the problem. 

The robust topology optimization method employs a 

deterministic search based on the topological gradient 

and the shape sensitivity. The algorithm requires two 

FEM solutions per evaluation of the robust objective 

function (one FEM solution for the nominal cost function 

value and sensitivity calculation, and one for the worst 

performance). In the deterministic search, the maximum 

number of objective function evaluations is set to 200 and 

the total time of execution is around a few hours. Thus 

the method is very efficient and fast to converge. 

However, the robust objective function defined in (1) is 

not necessarily partially differentiable and this may pose 

some difficulties to the optimization. Also, convexity of 

the objective function is not guaranteed, thus the worst 

performance point may be found inside the uncertainty 

set instead of on the corner. The worst performance 

prediction is only an approximation. 

On the other hand, the co-evolutionary GA does not 

rely on the sensitivity information. The algorithm 

maintains a population of the uncertainty variables and 

seeks for the exact worst performance point in the 

uncertainty set. The algorithm maintains two populations 

with 100 individuals for each population. The maximum 

number of generations of evolution is set to 100. The co-evolutionary

GA requires a total number of 20000 FEM 

solutions and the total execution time for the algorithm is 

around 5 days. Although a huge computation time is 

required for the co-evolutionary GA, this algorithm is 

parallelizable, thus the time may be reduced by choosing 

an appropriate scheme of parallel computing. Also, 

depending on the nature of the optimization problems, 

some modifications can be applied to the GA to reduce 

the number of the function evaluations. 
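The FEM budgets quoted for the two methods can be put side by side. The arithmetic below assumes one FEM solution per fitness evaluation in the co-evolutionary GA and that both populations are evaluated in every generation; the RTO figure is an upper bound because 200 is the maximum number of evaluations.

```latex
\begin{aligned}
N_{\mathrm{RTO}}  &\le 200 \times 2 = 400, \\
N_{\mathrm{ACGA}} &= 2 \times 100 \times 100 = 20000, \\
N_{\mathrm{ACGA}} / N_{\mathrm{RTO}} &\ge 50 .
\end{aligned}
```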

REFERENCES 

[1] H. T. Wang, Z. J. Liu, S. X. Chen, and J. P. Yang, “Application of Taguchi method to robust design of BLDC motor performance,” IEEE Trans. Magn., vol. 35, pp. 3700–3702, Sept. 1999. 

[2] Y. Sang-Baeck, et al., "Robust shape optimization of electromechanical devices," Magnetics, IEEE Transactions on, vol. 35, pp. 1710-1713, 1999. 

[3] C. M. Piergiorgio Alotto, Werner Renhart, Andreas Weber, Gerald Steiner, "Robust target functions in electromagnetic design," COMPEL: The International Journal for Computation and Mathematics in Electrical and Electronic Engineering, vol. 22, pp. 549 - 560, 2003. 

[4] G. Steiner, et al., "Managing uncertainties in electromagnetic 

design problems with robust optimization," Magnetics, IEEE 

Transactions on, vol. 40, pp. 1094-1099, 2004 

[5] F. G. Guimaraes, et al., "Multiobjective approaches for robust 

electromagnetic design," Magnetics, IEEE Transactions on, vol. 

42, pp. 1207-1210, 2006. 

[6] J. S. Han and B. M. Kwak, "Robust optimization using a gradient 

index: MEMS applications," Structural and Multidisciplinary 

Optimization, vol. 27, pp. 469-478, 2004. 

[7] K. Nam-Kyung, et al., "Robust Optimization Utilizing the Second-Order Design Sensitivity Information," Magnetics, IEEE Transactions on, vol. 46, pp. 3117-3120, 2010. 

[8] Min Li, David A. Lowther, "A robust objective function for topology optimization", COMPEL: The International Journal for Computation and Mathematics in Electrical and Electronic Engineering, Vol. 30 Iss: 6, pp. 1829 – 1841, 2011 

[9] G. Spagnuolo, "Worst case tolerance design of magnetic devices 

by evolutionary algorithms," Magnetics, IEEE Transactions on, 

vol. 39, pp. 2170-2178, 2003. 

[10] M. Cioffi, et al., "Stochastic handling of tolerances in robust 

magnets design," Magnetics, IEEE Transactions on, vol. 40, pp. 

1252-1255, 2004. 

[11] H. J. C. Barbosa, A coevolutionary genetic algorithm for 

constrained optimization. Proceedings of the 1999 Congress on 

Evolutionary Computation, CEC 99. vol. 3, 1999. 

[12] J. Kim, Co-evolutionary computation for constrained min-max 

problems and its applications for pursuit-evasion games. 

Proceedings of the IEEE Congress on Evolutionary Computation, 

CEC 2001, vol. 2, pp. 1205-1212, 2001. 

[13] J. M. Claverie, Robust nonlinear control design using competitive 

coevolution, Proceedings of the IEEE Congress on Evolutionary 

Computation, CEC 2000, vol. 1, pp. 403-409, 2000. 

[14] A. M. Cramer, et al., "Evolutionary Algorithms for Minimax 

Problems in Robust Design," Evolutionary Computation, IEEE 

Transactions on, vol. 13, pp. 444-453, 2009. 

[15] K. Dong-Hun, et al., "Smooth Boundary Topology Optimization 

for Electrostatic Problems Through the Combination of Shape and 

Topological Design Sensitivities," Magnetics, IEEE Transactions 

on, vol. 44, pp. 1002-1005, 2008. 

[16] D. H. Kim, et al., "The Implications of the Use of Composite 

Materials in Electromagnetic Device Topology and Shape 

Optimization," Magnetics, IEEE Transactions on, vol. 45, pp. 

1154-1157, 2009 

[17] R. L. Haupt and S. E. Haupt, Practical Genetic Algorithms, 2nd ed. Wiley-Interscience. ISBN 0-471-45565-2. 

[18] Y. Shi and R. A. Krohling, “Co-evolutionary particle swarm 

optimization to solve min-max problems,” in Proc. 2002 Cong. 

Evol. Comput., vol. 2, pp. 1682–1687 

[19] M. T. Jensen, “A new look at solving minimax problems with 

coevolution,” in Applied Optimization, Vol. 86, Metaheuristics:

Computer Decision-Making, M. G. C. Resende and J. Pinho de 

Sousa, Eds. Boston, MA: Kluwer, 2004, pp. 369–384. 

[20] J. Hur, H. Lee, and M.-J. Tahk, “Parameter robust control design 

using bimatrix co-evolution algorithms,” Eng. Optim., vol. 35, no. 

4, pp. 417–426, Aug. 2003. 

[21] MagNet user’s manual 2012, http://www.infolytica.ca 

[22] M. Li, and D. A. Lowther, “Robust Topology Optimization of an 

IPM Motor using Topological Analysis,” proceeding of 

CompuMag2011, 2011 


IGTE Symposium, TU Graz 2012 


Free-form Optimization for Magnetic Design 

Z. Andjelić 1 , S. Sadović 2 

1 ABB Corporate Research, Baden, Switzerland; 

2 Sadovic Consulting, Paris, France 

E-mail: zoran.andjelic@ch.abb.com 

Abstract— The paper presents an approach for free-form optimization of the magnetic problems. The approach is based on 

the novel simple sensitivity analysis and does not require the calculation of the adjoint problem. The solution engine in the 

background is IEM. The developed approach is illustrated on some typical benchmark problems. 

Index Terms— Free-form optimization, IEM, Sensitivity analysis 

I. INTRODUCTION 

When speaking about free-form optimization of industrial problems we distinguish two different approaches: the direct and the indirect approach. In the direct approach we try to minimize the maximal field quantities lying directly on the interface between different media by changing the form of those interfaces in the normal direction. Typical applications are optimization of structural problems [1], or dielectric design of electrical apparatus [2], [3], [4]. In the indirect approach we search for a prescribed distribution of the objective function in the space of interest by changing the shape of the structures outside of that space. In this paper we discuss the second approach in more detail, illustrated by some typical benchmark problems. 

II. FREE-FORM OPTIMIZATION 

For automatic shape optimization we follow a non-parametric, non-gradient approach, which in combination with IEM (Integral Equation Method) enables fast and robust optimization of real-world 3D problems [5]. The main benefits of such an approach compared to the standard parametric, gradient-based approaches are: 

- The applied procedure usually leads to the global optimum, contrary to the parametric optimization approach where the optimum can be searched only within the “parametric space” defined by the design parameters (radii, distances, etc.). 

- Since no design parameter is needed as input, it is not necessary to “communicate” with the CAD system during the iterative optimization procedure. As shown in Figure 1, the iterative framework in free-form optimization requires communication only between the Analysis and Optimization modules, whereas in parametric optimization a new set of parameters has to be generated in the CAD tool in each iteration, meshed in the mesh generator and then passed to the Analysis module for further processing. 

- Also, in free-form optimization the meshing of the analysed objects using the mesh generator is performed only in the first iteration. For all further iterations the mesh is updated directly in the Analysis module. 

- As we use the non-gradient approach, the calculation is much faster than with the gradient approach, which requires the costly calculation of the gradients. 

Figure 1: Free-form vs. parametric optimization framework 

As mentioned above we distinguish between the direct and indirect approaches for free-form optimization. One additional key difference between them is that for optimization problems following the direct approach it is not necessary to calculate any sensitivity function [2], [3]. In the current contribution we shall focus on the indirect approach, illustrated by some applications in magnetic design. It is important to note that the proposed approach is independent of the application class and can be used for optimization of not only magnetic but also dielectric, acoustic or similar classes of problems. It also has a generic character and can be used with FEM or another numerical method as the numerical engine in the background. In the present contribution we use IEM for the solution of the magnetostatic field problems. 

III. IEM FORMULATION 

The analysis of non-linear problems in magnetostatics by IEM is performed using the improved procedure described initially in [6] and elaborated in more detail recently in [7]. The magnetic field at any point in space can be found as: 

$$H = H^{J} + H^{M}, \qquad (1)$$

where H^J is the field component produced by the excitation current in free space and H^M is the field produced by the magnetic charges. The first field component can be easily calculated by the Biot-Savart law. For the calculation of the second one we use the formula: 

$$H^{M} = \frac{1}{4\pi} \int_{S} \sigma_J K_1 \, dS_J + \frac{1}{4\pi} \int_{V} \rho_N K_2 \, dV_N, \qquad (2)$$

where σ_J and ρ_N are the fictitious surface and volume magnetic charges, and K_1 and K_2 are the kernels of the type r / r^3. The surface charges are obtained by solving a Fredholm integral equation of the second kind: 


$$\frac{\sigma_I}{\lambda} - \frac{1}{2\pi} \oint_{S} \sigma_J G_1 \, dS_J = 2 H_n^{J} + \frac{1}{2\pi} \int_{V} \rho_N G_2 \, dV_N. \qquad (3)$$

Here G_1 and G_2 are the kernels of the type (r · n) / r^3 and λ = (μ_1 − μ_2) / (μ_1 + μ_2), where μ_1 and μ_2 are the relative 

permeabilities of the surrounding media and magnetic 

materials. When solving the linear problems the last term 

on the right-hand side of equation (3) is equal to zero. 
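Once the charges are known, the surface term of (2) can be evaluated numerically. The sketch below uses a one-point (midpoint) rule per surface patch, which is a simplification chosen for illustration; the patch data are invented, and the volume term would be accumulated in the same way with ρ and volume elements.

```python
import math

def field_from_surface_charges(point, patches):
    """Approximate H^M(point) = (1/4pi) * sum_j sigma_j * area_j * r / |r|^3.

    Each patch is (centre, sigma, area); r points from the patch centre to the field point.
    """
    hx = hy = hz = 0.0
    for (cx, cy, cz), sigma, area in patches:
        rx, ry, rz = point[0] - cx, point[1] - cy, point[2] - cz
        r3 = (rx * rx + ry * ry + rz * rz) ** 1.5
        w = sigma * area / (4.0 * math.pi * r3)
        hx += w * rx
        hy += w * ry
        hz += w * rz
    return (hx, hy, hz)

# Invented data: one patch at the origin, field point on the z-axis.
patches = [((0.0, 0.0, 0.0), 2.0, 0.5)]
print(field_from_surface_charges((0.0, 0.0, 2.0), patches))
```

For a single patch the result reduces to the field of a point charge of strength sigma times area, which gives a quick sanity check of the kernel r / r^3.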

Here it is important to stress some of the main features of IEM when solving the non-linear magnetostatic problem. In spite of the fact that it is necessary to mesh the volume of the non-linear magnetic parts, the number of unknowns for the non-linear problem is the same as the number of unknowns for the linear one. This is due to the fact that the non-linear contribution, the second term on the right-hand side of (3), appears just as a correction term and is calculated throughout the iteration procedure from the previous iteration. Also, as the material parameter λ appears only in the diagonal term, the matrix calculation is performed only during the first iteration. In other iterations only the diagonal term is changed together with the RHS term, taking into account the contributions due to volume charges. 

IV. INDIRECT APPROACH 

Contrary to the direct approach where it is not necessary 

to calculate any sensitivity functions, in indirect approach 

this function has to be established. To establish such 

function we use the analogy to the sensitivity analysis 

typically used in the signal-processing (SP) problems. In 

SP the objective of controller design is to keep the error 

between the controlled output and the external input as 

small as possible. In signal processing the sensitivity 

function S(s) is typically calculated as: 

$$S(s) = \frac{E(s)}{R(s) - d(s)}; \qquad E(s) = R(s) - Y(s), \qquad (4)$$

where E(s) is the feedback error, Y(s) is the controlled output, and R(s) and d(s) are the external input and the disturbance. 

To calculate the sensitivity for our optimization tasks we 

use the analogy to the above SP scheme. Here we take as 

example the quantities from the magnetic problem, 

Figure 2. 

Figure 2: Sensitivity calculation scheme for optimization 

tasks 

In magnetic problems the magnetic field in the space 

point of interest can be calculated using BEM [5] as: 


$$H(j) = \frac{1}{4\pi} \int_{\Gamma} \sigma(i)\, K(i, j)\, d\Gamma. \qquad (5)$$

The sensitivity of the field H in the space of interest to changes of the geometry of the magnetized body can then be obtained as: 

$$S = \frac{H^{G} - H^{C}}{H^{G} - H^{C}_{max}}. \qquad (6)$$

In the above case the external input H^G is a given, i.e. prescribed (desired), field distribution in the space of interest and H^C is the calculated field in the same space. The disturbance H^C_max is calculated as: 

$$H^{C}_{max}(i) = \max[\, H^{C}(i, j),\ j = 1, N \,]. \qquad (7)$$

The displacement vector D of the mesh nodes can then be calculated as: 

$$D = S\, n. \qquad (8)$$

More information on the calculation of the sensitivity function for free-form optimization tasks can be found in [8]. 
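The two mechanical steps of this scheme, taking the maximum over j as in (7) and moving mesh nodes by a sensitivity-scaled vector, can be sketched as below. The sensitivity values are supplied by the caller; treating n as a per-node unit normal and applying one scalar sensitivity per node are assumptions made for this illustration, and all numbers are invented.

```python
def row_maxima(h_c):
    """For each index i, the maximum of H_C(i, j) over j = 1..N."""
    return [max(row) for row in h_c]

def displace_nodes(nodes, normals, s):
    """Move node k by s[k] along its normal vector: new position = node + s * n."""
    return [tuple(p + s_k * n for p, n in zip(node, normal))
            for node, normal, s_k in zip(nodes, normals, s)]

# Invented data: two nodes with unit normals and one sensitivity value per node.
h_c = [[1.0, 3.0, 2.0], [5.0, 4.0, 0.5]]
nodes = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
normals = [(0.0, 0.0, 1.0), (1.0, 0.0, 0.0)]
s = [0.1, -0.2]

print(row_maxima(h_c))
print(displace_nodes(nodes, normals, s))
```

Because only node coordinates change, the mesh connectivity can be kept between iterations, in line with the remark that meshing with the mesh generator is needed only once.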

For illustration the above procedure has been used to 

optimize the Die mold problem, Example 1 and Field 

homogenization problem, Example 2. 

V. EXAMPLE 1: DIE MOLD OPTIMIZATION 

This is TEAM benchmark problem No. 25, used up to now for benchmarking codes dealing with 2D parametric optimization [9], Figure 3. 

Figure 3: TEAM benchmark problem No. 25 

Here we use the same model as a 3D problem by adding an extrusion of 200 mm in the y-direction, Figure 4. The objective is to obtain a homogeneous radial field distribution within the cavity shown in Figure 3. One of the die molds (the inner cylinder) is kept fixed and the other one is, in our approach, subjected to the free-form optimization process in order to get the radial field distribution in the cavity. The 3D model is shown in Figure 4 and the detailed 2D view in Figure 5. 

Figure 4: 3D model of the Team problem No. 25 

Figure 5: Details of the Team problem No. 25 

Applying the module for free-form optimization governed by the sensitivity calculation given above, we have obtained the optimal form of the magnetic poles after 24 iterations, Figure 6 (in red). 

Figure 6: Outer magnetic mold before and after optimization 

This new form of the magnetic poles provides the desired radial field distribution in the cavity. Figure 7 shows the form of the magnetic poles before optimization, after the 10th iteration and as the optimal form after 24 iterations. The field vectors illustrate the changes of the field during the optimization process. Only at the end of the die molds are some deviations observed, caused by the end-region field disturbances. 


Figure 7: Magnetic field homogenization during the 


VI. EXAMPLE 2: AIR GAP FIELD HOMOGENIZATION 

In this example the objective is to achieve a homogeneous field distribution over the prescribed space of interest lying in the air gap between the magnetic poles, Figure 8. 

Figure 8: Model of the magnetic structure 

The core is made of a material with μ_r = 1500 and is excited by a current-carrying coil with I = 12240 A. The magnetic field distribution over the space of interest before any optimization is shown in Figure 9 and Figure 11 (a). The field over the space of interest varies from 35737 A/m to 57555 A/m. As the optimization objective we define here the desired value of the homogeneous field over the space of interest (50 x 50 mm) as H_d = 50000 A/m. After applying the optimization modus governed by the above sensitivity calculation, we have obtained after 37 iterations the optimal form of the magnetic pole shoes that delivers the desired field distribution with an error of less than 10%, Figure 10.


Figure 9: Magnetic field distribution over the space of 

interest before any optimization 

Figure 10: Magnetic field distribution over the space of 

interest after 37 iterations. The magnetic poles have changed 

the form in order to provide prescribed homogeneous field of 

50000A/m. 

Figure 11 shows in more detail the field distribution over 

the space of interest before (a) and after optimization (b). 

Figure 11: Detailed view on the field distribution over the 

space of interest before (a) and after (b) optimization 

The field variation for the optimal design (with a threshold error of 10%) is between H_min = 46489 A/m and H_max = 55027 A/m. 
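The quoted field ranges can be converted into relative deviations from the target value of 50000 A/m. The sketch below is plain arithmetic on the numbers given in the text; the helper name is introduced here for illustration.

```python
def deviation_range(h_min, h_max, h_target):
    """Relative deviation of the smallest and largest field value from the target."""
    return ((h_min - h_target) / h_target, (h_max - h_target) / h_target)

H_TARGET = 50000.0                                   # A/m
before = deviation_range(35737.0, 57555.0, H_TARGET)
after = deviation_range(46489.0, 55027.0, H_TARGET)

for label, (low, high) in (("before", before), ("after", after)):
    print(f"{label}: {100.0 * low:+.2f} % to {100.0 * high:+.2f} %")
```

With the rounded values quoted above, the spread shrinks from roughly -28.5 % / +15.1 % before optimization to roughly -7.0 % / +10.1 % afterwards, so the upper deviation sits right at the 10 % threshold.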


VII. CONCLUSION 

The paper elaborates a procedure for free-form optimization of magnetic problems. The procedure is based on a novel approach for simple sensitivity calculation. The proposed approach does not require calculation of the adjoint problem and has a generic character with respect to both the classes of application (magnetic, dielectric, acoustics...) and the numerical methods used within the simulation engine (BEM, FEM). 



Optimization for ECT treatment planning 

1 P. Di Barba, 3 L.G. Campana, 2 F. Dughiero, 3 C.R. Rossi, 2 E. Sieni 

1 Department of Industrial and Information Engineering, Pavia University, via Ferrata 1, 27100 Pavia (Italy) 

2 Department of Industrial Engineering, Padova University, via Gradenigo 6/A, 35131 Padova (Italy) 

3 Melanoma and Sarcoma Unit, Istituto Oncologico Veneto (IOV),Via Gattamelata 64, 35128 Padova (Italy) 

E-mail: paolo.dibarba@unipv.it,{fabrizio.dughiero, carlor.rossi, elisabetta.sieni}@unipd.it, luca.campana@ioveneto.it 

Abstract—Treatment planning for electrochemotherapy (ECT) is carried out by means of a genetic multi-objective optimization
method: the needle positions maximizing the electric field in the treated volume are searched for. The NSGA algorithm is coupled
with a penalty function technique in order to identify the constrained Pareto front, select the best compromise solutions and
discard the unfeasible ones.

Index Terms—Electrochemotherapy, conduction field, Finite Element, Pareto front, NSGA. 


I. INTRODUCTION

ECT uses pulses of electric field in order to improve the delivery of
chemotherapeutic drugs into cancer cells [1]-[2]. A suitable electric field
intensity induces cell membrane permeabilization, which improves the
chemotherapy drug delivery. However, a high electric field intensity in
healthy tissues, and in some critical regions such as large vessels, is to
be prevented. A conduction electric field is applied to tumor tissues by
means of needle electrodes suitably positioned in the target volume. In
order to improve the therapy success, the positioning of the electrodes is
treated as an optimization problem. The research group at Ljubljana
University has proposed some solutions for optimal electrode positioning in
deep-seated tumors, as in [3-6]. In this paper a multiobjective
optimization method based on a modified NSGA-II algorithm, which includes
constraints and penalty functions in order to prevent unfeasible solutions,
is proposed for the optimal positioning of needles in the tumor mass
[7-10]. The optimization problem is solved using a 2D model of the steady
conduction field.

II. CLINICAL ECT 

ECT is a medical therapy based on cell electroporation for patients with
cutaneous and subcutaneous tumor nodules. It relies on the synergistic
association of locally applied brief electric currents (reversible
electroporation) and low-permeant anticancer agents [11-13].
Electroporation is a local electric treatment that exploits the physical
response of cells to a pulsed electric field, which opens pores in the cell
membrane. These openings act as channels that enhance the penetration of
drugs, genes, or molecular probes into cancer cells: an applied electric
field of suitable intensity increases the cell membrane permeability
[14-17]. Figure 1 shows the main phases of chemotherapy drug administration
using the ECT technique: in phase I of the treatment the clinician injects
the drug (e.g. bleomycin), then during phase II the electric pulse is
applied, and finally the drug penetrates the cell membranes.

Since its development at the Institut Gustave Roussy, this technique has
been quickly tested in the clinical setting and has recently entered
clinical practice [11-12], [18-21]. At the Melanoma and Sarcoma Unit of the
“Istituto Oncologico Veneto” (IOV) in Padova, Italy, the clinical
application of ECT using standard electrodes [23] has already shown
satisfactory results [22]. Standard electrodes are a set of 7 needles with
a length between 10 and 30 mm on a rigid support [23]. The ECT equipment
manufacturer also produces a long-needle machine that can be used to treat
some deep-seated tumors, such as sarcoma, with ECT [23-25]. In this case
the clinician implants single electrodes, 20 cm long, in the tumor mass
according to medical images of the tumor and clinical practice. Hence, for
the flexible long-needle equipment, it is of interest to improve the
therapy success by using optimization algorithms to study the electric
field produced by different configurations of electrodes implanted in the
tumor mass.

Figure 1: Description of the ECT application (labels: ECT electrode,
electric field E, skin surface).

III. DIRECT PROBLEM: ELECTRIC FIELD ANALYSIS 

In general, the case study models three regions: the tumor, T, with an
average radius of 3 cm, the surrounding healthy tissue, H, and a region
close to the treated region that might be a critical one, C. Each region is
assigned the relevant electric conductivity [3]. The needle electrodes are
represented as a set of nine points. In particular, the fixed main
electrode is located in the center of the lesion, whereas the other eight
electrodes, which can be moved, are placed around the central one. The ECT
process applies a sequence of voltages in the range 1 to 3 kV to each
electrode pair. The imposed voltages represent the boundary conditions of
the field problem. The electric field is then computed by means of the
finite-element method (FEM), solving a steady conduction problem for each
electrode pair: specifically, 16 field analyses on the same grid are needed
to compute the electric field for each needle configuration [26]. The
solved equation is:

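$$\nabla \cdot (\sigma \nabla V) = 0 \qquad (1)$$

imposing a homogeneous Neumann condition on the electric scalar potential
on the domain boundary:

$$\frac{\partial V}{\partial n} = 0 \qquad (2)$$

Finally, the electric potential is fixed to a constant value in two of the
ne electrodes in the following way:

$$V|_{U_i} = U, \qquad V|_{U_j} = 0, \qquad \forall (i, j),\ i \neq j,\ i, j = 1, \dots, n_e \qquad (3)$$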

Then, given an electrode configuration, solving the direct problem requires
repeating the field analysis, i.e. solving (1), for all possible (i, j)
pairs of electrodes.

Figure 1: Geometry of the 2D conduction field (labels: healthy tissue H,
critical region C, tumor T, electrode Ui, main electrode).

Once all the ne field analyses are available, the highest value of the
electric field over the whole set of analyses is searched for at each mesh
node of the problem domain and recorded in sets named Emax(i), one for each
examined region.
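The direct problem and the node-wise maximum can be illustrated with a small stand-in solver. The sketch below is not the FEM model of the paper: it assumes a uniform conductivity (so that (1) reduces to Laplace's equation), replaces FEM with finite differences and Gauss-Seidel sweeps on a square grid, places point electrodes at hypothetical grid nodes, and loops over all unordered electrode pairs. All function names are illustrative.

```python
import itertools
import math


def solve_pair(n, anode, cathode, u, sweeps=300):
    """Gauss-Seidel solution of Laplace's equation on an n x n node grid.

    anode and cathode are (row, col) nodes held at u and 0 V.
    Uniform conductivity is an assumption of this sketch.
    """
    fixed = {anode: u, cathode: 0.0}
    v = [[0.5 * u] * n for _ in range(n)]
    for (r, c), value in fixed.items():
        v[r][c] = value
    for _ in range(sweeps):
        for r in range(n):
            for c in range(n):
                if (r, c) in fixed:
                    continue
                # Homogeneous Neumann boundary: mirror the inner neighbour.
                up = v[r - 1][c] if r > 0 else v[r + 1][c]
                down = v[r + 1][c] if r < n - 1 else v[r - 1][c]
                left = v[r][c - 1] if c > 0 else v[r][c + 1]
                right = v[r][c + 1] if c < n - 1 else v[r][c - 1]
                v[r][c] = 0.25 * (up + down + left + right)
    return v


def field_magnitude(v, h):
    """|E| = |grad V| from central (one-sided at the edges) differences."""
    n = len(v)
    e = [[0.0] * n for _ in range(n)]
    for r in range(n):
        for c in range(n):
            r0, r1 = max(r - 1, 0), min(r + 1, n - 1)
            c0, c1 = max(c - 1, 0), min(c + 1, n - 1)
            ey = (v[r1][c] - v[r0][c]) / ((r1 - r0) * h)
            ex = (v[r][c1] - v[r][c0]) / ((c1 - c0) * h)
            e[r][c] = math.hypot(ex, ey)
    return e


def emax_over_pairs(n, h, electrodes, u, sweeps=300):
    """Node-wise maximum of |E| over one analysis per electrode pair."""
    emax = [[0.0] * n for _ in range(n)]
    for a, b in itertools.combinations(electrodes, 2):
        e = field_magnitude(solve_pair(n, a, b, u, sweeps), h)
        for r in range(n):
            for c in range(n):
                emax[r][c] = max(emax[r][c], e[r][c])
    return emax


if __name__ == "__main__":
    # Hypothetical layout: 21 x 21 nodes, 1 mm spacing, three electrodes, 1 kV.
    nodes = [(10, 10), (10, 15), (5, 10)]
    emax = emax_over_pairs(21, 1.0e-3, nodes, 1000.0)
    print(max(max(row) for row in emax))
```

Splitting the resulting grid by region (T, H, C) would then give the sets Emax(i) used by the objective functions.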

IV. INVERSE PROBLEM: OPTIMAL ELECTRODE 

POSITIONING 

The therapy efficacy depends on the electric field intensity applied to the
cells. In some practical cases the proximity to a prescribed therapeutic
value of the temperature is searched for [27-28], whereas in our case it is
the exceeding of a given electric field threshold that has to be
controlled. The ideal configuration of needle electrodes is the one that
maximizes the sub-volume of the tumor region covered with an electric field
intensity over the electropermeabilization threshold [3], ERE, and
simultaneously minimizes the volume of healthy tissues or critical organs
exposed to an electric field higher than ERE [29-30]. Accordingly, the
following objective functions, to be minimized, have been defined:

$$f_1(E) = 100 \left( 1 - \frac{N_E(E > E_{RE})}{N_{E,tot}} \right) \qquad (4)$$

that represents the complementary sub-volume of the tumor region, i.e. the
percentage of the tumor in which the electric field is under ERE. It is
evaluated from the number of nodes, NE, in which the electric field
intensity is higher than ERE; NE,tot is the total number of nodes in which
the electric field is evaluated in the tumor region. The second design
criterion evaluates the nodes of the healthy tissue region, or of the
region of a critical organ (e.g. a large vessel), in which the electric
field exceeds a prescribed threshold ETH:

$$g(E, E_{TH}) = 100\, \frac{N_E(E > E_{TH})}{N_{E,tot}} \qquad (5)$$

Starting from (5), three objective functions have been generated, namely:

(a) $$f_2 = g(E, E_{IRE}) \qquad (6)$$

in which the threshold of the electric field is fixed to the irreversible
electroporation value, EIRE = 10^5 V/m, and g is computed in the tumor
region T;

(b) $$f_3 = g(E, E_{TH1}) \qquad (7)$$

in which the threshold of the electric field is fixed to the reversible
electroporation value, ETH1 = 4·10^4 V/m, and g is computed on the healthy
tissue H. In this case it is desirable that the electric field does not
exceed the threshold ETH1, in order to preserve healthy tissue; and
finally:

(c) $$f_4 = g(E, E_{TH2}) \qquad (8)$$

In this case the electric field must not exceed ETH2 = 10^3 V/m in the
critical region C, to prevent damage to critical organs. This threshold is
generally chosen lower than the electroporation threshold in order to
ensure an electric field lower than ERE.

All the objective functions (4) and (6)-(8) are computed 

using the Emax(i) set of values in the corresponding 

region of the computation domain. 
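The node-counting objectives (4) and (6)-(8) can be written compactly once the Emax values of each region are available as flat lists. The following is a minimal sketch: the function names and the sample field values are hypothetical, ERE is set equal to ETH1 as stated for Case 1, and the remaining thresholds are those quoted in the text.

```python
E_RE = 4.0e4   # V/m, reversible electroporation threshold (ETH1 = ERE)
E_IRE = 1.0e5  # V/m, irreversible electroporation threshold
E_TH1 = 4.0e4  # V/m, threshold in the healthy tissue H
E_TH2 = 1.0e3  # V/m, threshold in the critical region C


def g(e_values, threshold):
    """Percentage of nodes whose peak field exceeds the threshold, as in (5)."""
    above = sum(1 for e in e_values if e > threshold)
    return 100.0 * above / len(e_values)


def f1(emax_tumor):
    """Percentage of tumor nodes NOT reaching ERE, as in (4)."""
    return 100.0 - g(emax_tumor, E_RE)


def f2(emax_tumor):
    """Percentage of tumor nodes above the irreversible threshold, as in (6)."""
    return g(emax_tumor, E_IRE)


def f3(emax_healthy):
    """Percentage of healthy-tissue nodes above ETH1, as in (7)."""
    return g(emax_healthy, E_TH1)


def f4(emax_critical):
    """Percentage of critical-region nodes above ETH2, as in (8)."""
    return g(emax_critical, E_TH2)


if __name__ == "__main__":
    # Hypothetical peak-field samples in V/m, for illustration only.
    emax_t = [3.0e4, 5.0e4, 1.2e5, 6.0e4]
    emax_h = [1.0e3, 5.0e4]
    emax_c = [5.0e2, 2.0e3, 8.0e2, 1.0e2]
    print(f1(emax_t), f2(emax_t), f3(emax_h), f4(emax_c))
```

All four functions are percentages to be minimized, so a configuration with f1 = 0 covers the whole tumor with a field above ERE.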

Figure 2: f1 and f3 optimization goal (areas labeled A1 and A2).

Accordingly, a sequence of bi-objective optimization problems has been
considered and solved: find the Pareto front minimizing the pair of
functions (f1, fk), with k = 2, 3 and 4, subject to the solution of the
direct problem (1) and to a set of geometrical constraints on the electrode
positions. For instance, the minimum distance between two electrodes must
be greater than 10 mm. Constraints have been incorporated in the objective
functions by means of a penalty term, as in [7]. For instance, Figure 2
shows the f1 and f3 optimization goal: f1 tends to maximize the area A1,
whereas f3 tends to minimize the area A2. Moreover, Figure 3 shows the
effect of the penalty constraint: if the non-penalty algorithm is used, two
electrodes can be at a distance lower than the prescribed minimum (10 mm),
like the pair marked with a circle in Figure 3 (a). In contrast, if the
penalty algorithm is used, configurations with electrodes that are too
close are discarded, and a possible configuration is like the one in
Figure 3 (b).
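A possible way to combine the distance constraint with the Pareto selection is sketched below. The exact penalty term of [7] is not given in this section, so the additive penalty, its weight and all function names are assumptions made for illustration; the non-dominated filter is the standard definition for minimization problems, not the full NSGA-II ranking.

```python
import math

MIN_DISTANCE = 10.0e-3  # m, minimum electrode spacing quoted in the text


def spacing_violation(electrodes, d_min=MIN_DISTANCE):
    """Sum of the shortfalls below d_min over all electrode pairs."""
    total = 0.0
    for i in range(len(electrodes)):
        for j in range(i + 1, len(electrodes)):
            distance = math.dist(electrodes[i], electrodes[j])
            total += max(0.0, d_min - distance)
    return total


def penalized(objectives, electrodes, weight=1.0e4):
    """Add the same penalty to every objective (assumed additive form)."""
    penalty = weight * spacing_violation(electrodes)
    return tuple(f + penalty for f in objectives)


def dominates(a, b):
    """True if a is no worse than b everywhere and better somewhere."""
    no_worse = all(x <= y for x, y in zip(a, b))
    better = any(x < y for x, y in zip(a, b))
    return no_worse and better


def pareto_front(points):
    """Keep the points that no other point dominates (minimization)."""
    return [p for p in points
            if not any(dominates(q, p) for q in points if q != p)]


if __name__ == "__main__":
    # Hypothetical (f1, f3) pairs for four electrode configurations.
    candidates = [(1.0, 5.0), (2.0, 2.0), (3.0, 3.0), (5.0, 1.0)]
    print(pareto_front(candidates))
```

With a sufficiently large weight, configurations that violate the 10 mm spacing are pushed away from the front, which mimics discarding them.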

(a) (b) 

Figure 3: Penalty algorithm effect. 

V. RESULTS 

Results for some optimized configurations are presented here.

A. Case 1: penalty vs non-penalty 

The optimization problem requires the electric field in the tumor region T
to exceed the electroporation threshold ERE (f1) and the electric field in
the healthy tissue region, H, to stay below the electroporation threshold
ETH1 = ERE (f3).

In this case, results obtained using the penalty algorithm are compared
with results obtained using the non-penalty algorithm. Figure 4 reports the
two Pareto fronts obtained starting from the same initial population and
using the two algorithms: the penalty term reshapes the Pareto front.

Figure 4: Pareto front for case 1 using the penalty and non-penalty
algorithms.

Figure 5: Optimized electrode configurations: electric field in the
examined region using (a) the penalty and (b) the non-penalty algorithm
(color scale: E [V/m] from 0 to 200·10^3; label: Not feasible).


In Figure 5 the highest value of the electric field obtained at each domain
point when the whole sequence of electrode discharges of an ECT treatment
is applied is reported for the tumor region, Emax(T), and the healthy
tissue, Emax(H). The corresponding electrode positions are indicated by
dots.

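B. Case 2: preventing irreversible electroporation

The optimization problem requires the electric field in the tumor region T
to exceed the electroporation threshold ERE (f1) and to stay below the
irreversible threshold, EIRE (f2).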

Figure 6 reports the Pareto front obtained starting from an 

initial population and using the penalty algorithm. 

Figure 6: Pareto Front for the case 2 using penalty 

algorithm. 

In Figure 7 the highest value of the electric field obtained at each domain
point when the whole sequence of electrode discharges of an ECT treatment
is applied is reported for the tumor region, Emax(T). The corresponding
electrode positions are indicated by dots.

Figure 7: Optimized electrode configuration: electric field in the examined
region using the penalty algorithm (color scale: E [V/m] from 0 to
200·10^3).

In this case the electrodes are positioned in the healthy tissue, because
irreversible electroporation has to be avoided in order to prevent cell
necrosis.

C. Case 3: preserving a critical organ (blood vessel)

The optimization problem requires the electric field in the tumor region T
to exceed the electroporation threshold ERE (f1) and the electric field in
the critical region, C, to stay below the threshold ETH2 (f4).

Figure 8 reports the Pareto front obtained starting from an initial
population and using the penalty algorithm. In Figure 9 the highest value
of the electric field obtained at each domain point when the whole sequence
of electrode discharges of an ECT treatment is applied is reported for the
tumor region, Emax(T), and the critical region, Emax(C). The corresponding
electrode positions are marked by black and green dots.

Figure 8: Pareto front for case 3 using the penalty algorithm.

Figure 9: Optimized electrode configuration: electric field in the examined
region using the penalty algorithm (color scale: E [V/m] from 0 to
200·10^3).

In this case the electrodes are far from the critical region, whereas in
Figure 5 they are close to, and even inside, the critical region. Hence,
different objective functions lead to different electrode configurations,
depending on the problem constraints.


VI. CONCLUSION

The NSGA-II algorithm, modified to include constraints and a penalty
function, has been applied to the design of ECT electrode positioning.
Various pairs of objective functions have been implemented in order to
compare the results obtained for different problem targets.

VII. ACKNOWLEDGMENT

This project has been developed in the frame of a Post-Doc grant awarded by
Padova University, Italy.

REFERENCES 

[1] G. Sersa, D. Miklavcic, M. Cemazar, Z. Rudolf, G. Pucihar, M. 

Snoj, “Electrochemotherapy in treatment of tumours”, European 

J. Surgical Oncology, 34 (2), 232–240, 2008. 

[2] S. Corovic, A. Zupanic, D. Miklavcic, “Numerical modeling and 

optimization of electric field distribution in subcutaneous tumor 

treated with electrochemotherapy using needle electrodes”, IEEE 

Trans. on Plasma Science, 36(4), 1665–1672, 2008. 

[3] B. Kos, A. Zupanic, T. Kotnik, M. Snoj, G. Sersa, D. Miklavcic, 

“Robustness of treatment planning for electrochemotherapy of 

deep-seated tumors”, J. Membrane Biol., 236(1), 147–153, 2010. 

[4] D. Sel, A. M. Lebar, D. Miklavcic, “Feasibility of employing 

model-based optimization of pulse amplitude and electrode 

distance for effective tumor electropermeabilization”, Biomedical 

Engineering, IEEE Trans, 54(5), 773–781, 2007. 

[5] D. Miklavcic, M. Snoj, A. Zupanic, B. Kos, M. Cemazar, M.
Kropivnik, M. Bracko, T. Pecnik, E. Gadzijev, G. Sersa,
“Towards treatment planning and treatment of deep-seated solid
tumors by electrochemotherapy”, BioMedical Eng. OnLine, 9(1),
10, 2010.

[6] A. Županič, S. Čorović, D. Miklavčič, “Optimization of electrode 

position and electric pulse amplitude in electrochemotherapy”, 

Radiology and Oncology, vol. 42(2), 93–101, 2008. 

[7] P. Di Barba, L.G. Campana, F. Dughiero, C.R. Rossi, E. Sieni, 

“Optimal needle positioning for electrochemotherapy: a 

constrained multiobjective strategy”, to appear in Proc. CEFC 

2012. 

[8] K. Deb, A. Pratap, S. Agarwal, T. Meyarivan, “A fast and elitist 

multiobjective genetic algorithm: NSGA-II”, IEEE Trans. 

Evolutionary Computation, vol. 6(2), 182–197, 2002. 

[9] P. Di Barba, Multiobjective Shape Design in Electricity and 

Magnetism. Springer, 2010. 

[10] P. Di Barba, F. Dughiero, E. Sieni, “Field synthesis for the 

optimal treatment planning in Magnetic Fluid Hyperthermia”, 

Archives of Electrical Engineering, vol. 61(1), 57–67, 2012. 

[11] L.M. Mir, “Therapeutic perspectives of in vivo cell
electropermeabilization”, Bioelectrochemistry, 53, 1–10, 2000.

[12] Belehradek M, Domenge C, Luboinski B, et al. 

“Electrochemotherapy, a new antitumor treatment. First clinical 

phase I–II trial”, Cancer, 72, 3694–700, 1993. 

[13] L.M. Mir, S. Orlowski, “Mechanisms of electrochemotherapy”,
Adv Drug Del Rev., 35, 107–18, 1999.

[14] C. Chen, S.W. Smye, M.P. Robinson, et al. “Membrane 

electroporation theories: a review”, Med Biol Eng Comput. 44, 5– 

14, 2006. 

[15] S. Somiari, J. Glasspool-Malone, J.J. Drabick, et al. “Theory and 

in vivo application of electroporative gene delivery”, Mol Ther., 3, 

178–87, 2000. 

[16] R. Heller, M.J. Jaroszeski, A. Atkin, et al. “In vivo gene 

electroinjection and expression in rat liver”, FEBS Lett.,389, 225– 

8, 1996. 

[17] M. Cemazar, G. Sersa, “Electrotransfer of therapeutic molecules 

into tissues”, Curr Opin Mol Ther., 9, 554–62, 2007. 

[18] B.J. Mossop, R.C. Barr, W. Henshaw, et al. “Electric fields in
tumors exposed to external voltage sources: implication for
electric field mediated drug and gene delivery”, Ann Biomed
Eng., 34, 1564–72, 2006.

[19] A. Gothelf, L.M. Mir, J. Gehl, “Electrochemotherapy: results of 

cancer treatment using enhanced delivery of bleomycin by 

electroporation”, Cancer Treat Rev., 29, 371–87, 2003. 

[20] G. Sersa, “The state-of-the-art of electrochemotherapy before the 

ESOPE study; advantages and clinical use”, EJC Suppl., 4, 52–9, 

2006. 

[21] M. Snoj, M. Cemazar, T. Srnovrsnik, S. P. Kosir, G. Sersa, “Limb 

sparing treatment of bleeding melanoma recurrence by 

electrochemotherapy”, Tumori, 95(3), 398–402, 2009. 

[22] L. Campana, S. Mocellin, M. Basso, O. Puccetti, G. De Salvo, V. 

Chiarion-Sileni, A. Vecchiato, L. Corti, C. Rossi, D. Nitti, 

“Bleomycin-Based Electrochemotherapy: Clinical Outcome from 

a Single Institution’s Experience with 52 Patients”, Annals of 

Surgical Oncology, 16(1), 191–199, 2009. 

[23] Cliniporator, Igea: http://www.igea.it (last visited October 2012). 

[24] B. Kos, A. Zupanic, T. Kotnik, M. Snoj, G. Sersa, D. Miklavcic, 

“Robustness of treatment planning for electrochemotherapy of 

deep-seated tumors”, J. Membrane Biol., 236(1), 147–153, 2010. 

[25] I. Edhemovic, E. M. Gadzijev, E. Brecelj, D. Miklavcic, B. Kos, 

A. Zupanic, B. Mali, T. Jarm, D. Pavliha, M. Marcan, G. 

Gasljevic, V. Gorjup, M. Music, T. P. Vavpotic, M. Cemazar, M. 

Snoj, G. Sersa, “Electrochemotherapy: a new technological 

approach in treatment of metastases in the liver”, Technol. Cancer 

Res. Treat., vol. 10(5), 475–485, 2011. 

[26] Cedrat: http://www.cedrat.com/ (last visited October 2012). 

[27] I. M. V. Caminiti, F. Ferraioli, A. Formisano, R. Martone,
“Adaptive Ablation Treatment Based on Impedance Imaging”,
IEEE Trans. Magn., 46(8), 3329–3332, 2010.

[28] I. M. V. Caminiti, F. Ferraioli, A. Formisano, R. Martone, “Three 

dimensional optimal current patterns for radiofrequency ablation 

treatments”, COMPEL, 31(3), 985–995, 2012. 

[29] P. Neittaanmaki, M. Rudnicki, A. Savini, Inverse problems and
optimal design in electricity and magnetism, Oxford Science
Pub., 1996.

[30] P. Di Barba, F. Dughiero, E. Sieni, “Synthesizing Distributions of 

Magnetic Nanoparticles for Clinical Hyperthermia”, IEEE Trans. 

Magn., 48(2), 263–266, 2012.


Investigation of the Electroporation Effect 

in a Single Cell 

Jaime A. Ramirez ∗ , William P.D. Figueiredo ∗ , Joao Francisco C. Vale ∗ , Isabela D. Metzker ∗ , Rafael G. Santos ∗ , 

Matheus S. de Mattos ∗ Elizabeth R.S. Camargos ∗ , and David A. Lowther † 

∗ Federal University of Minas Gerais, Belo Horizonte, Brazil 

† McGill University, Montreal, Canada 

E-mail: jramirez@ufmg.br 

Abstract—This paper investigates the electroporation phenomenon in a single cell exposed to ultra-short (μs) and high-voltage (kV/m)
electric pulses. The problem is addressed by two complementary approaches. First, numerical simulations based on an asymptotic
approximation derived from the Smoluchowski theory are used to calculate the pore generation, growth and size evolution at the
membrane of a spherical cell model, immersed in a suspension medium and consisting of cytoplasm and membrane. The numerical
calculations are solved using the finite difference method. Second, an in vitro experiment with LLC-MK2 cells is carried out, in which
electroporation is monitored with molecules of propidium iodide. This part also comprised the design and manufacturing of a
portable electric pulse generator capable of providing rectangular pulses with an amplitude of 1,000V and a duration in the range of 1-μs
to 100-μs. The pulse generator is composed of three modules: a high voltage dc source, a control module, and an energy storage and
high voltage switching module. The numerical simulations considered a 5-μm radius cell subjected to a 500kV/m rectangular electric pulse
for 1-μs. The results indicate the formation of ∼3,500 pores at the cell membrane, most of them, ∼950, located at the poles of the cell
aligned with the applied electric pulse, with radii varying from 0.5-nm to 13-nm. The in vitro experiment considered the exposure
of LLC-MK2 cells to pulses of 200V, 500V, and 700V, with a duration of 1-μs. Images from fluorescence microscopy show the LLC-MK2 cells
in intense red, strong evidence of electroporation.

Index Terms—Electroporation, electric fields, finite difference method. 


I. INTRODUCTION

Electroporation is the process of applying pulsed electric fields to
biological cells to induce the formation of transient “pores” in the cell
membrane. Depending on the magnitude and duration of the electric pulse,
the membrane may recover its original state (the pores reseal), which is a
reversible process; otherwise the cell dies, which is an irreversible
process. This phenomenon was first reported by [1] and is well discussed in
the contributions [2]-[4].

Earlier studies focused on relatively low external fields, i.e. less than a
kilovolt per centimeter, applied over time periods ranging from several
tens of microseconds to milliseconds. Recently, high electric fields
(∼100kV/cm or higher) with pulse durations in the nanosecond range have
been employed, opening a new area of research in bioelectrics [5]. A
controlled electroporation process can, therefore, be used to deliver
substances to the cell cytoplasm in a wide range of applications, including
gene therapy, drug delivery, non-thermal inactivation of micro-organisms
and cancer treatment; see for instance [4].

From the practical point of view, controlling the electroporation process
involves two complementary challenges. First, a comprehensive simulation
analysis is required. The time-dependent electric field induced at the cell
membrane by the external pulse needs to be obtained, since it is this field
that provides the dynamic driving force for the physical process. In
addition, the dynamical evolution of the pores at the cell membrane under
the influence of this field needs to be adequately treated. Second, a
detailed experimental laboratory test is necessary to confirm the
simulation. This involves building an electric pulse generator in which the
pulse width and electric field strength are controlled. Moreover,
appropriate microscopy techniques are required to confirm the pore
formation at the cell membrane.

In terms of numerical simulations, most works that consider the dynamic
aspects of the electroporation phenomenon are based on the Smoluchowski
theory. Krassowska et al. [6]-[9] employ an asymptotic approximation of the
Smoluchowski theory in a single-cell model to determine the formation of
pores in a spherical cell subjected to electric pulses of a few kV/m in the
millisecond range. Schoenbach et al. [10]-[13], [5] use the full equations
of the Smoluchowski theory to establish the nucleation of pores in
spherical cell models exposed to high-intensity (thousands of kV/m) and
ultra-short (nanosecond) electric pulses. The approaches used by both
groups yield acceptable results that are used in a wide range of
applications; in the first case they are confirmed indirectly and in the
second case they are verified experimentally. However, the electroporation
phenomenon still needs to be addressed for electric pulses in the
microsecond range.

This work investigates the electroporation phenomenon in a single cell
subjected to electric pulses with a magnitude in the order of 1kV/mm and a
duration of 1-μs. The phenomenon is addressed by two complementary
approaches, numerical simulations and an in vitro experiment. The Material
and Methods section is divided into three subsections. First, the
mathematical modeling of electroporation describes how numerical
simulations can be used to assess the dynamics of the pore formation
process, i.e. pore generation, growth and size evolution at the cell
membrane. This is based on an asymptotic approximation of the Smoluchowski
theory and is solved using the finite difference method. The formulation
provides the voltage induced across the cell membrane and important
features for the practical application of electroporation, i.e. the number
of pores and the distribution of pore radii as functions of time and
position on the cell membrane. A detailed description of how the numerical
calculations are made is also given. Second, the electric pulse generator
subsection discusses the theory used to design and build the generator. The
first module is a high voltage d.c. source, the second is a control module
and the third is responsible for the energy storage and high voltage
switching. The generator is capable of providing rectangular pulses with an
amplitude of 1,000V and a duration in the range of 1μs to 100μs, with
resting intervals of 10μs between the pulses. Third, the cell culture
subsection describes the procedures used to prepare the LLC-MK2 cells for
the in vitro experiment with molecules of propidium iodide. Finally, the
Results section presents the numerical analysis of a spherical cell of 5-μm
radius exposed to an electric pulse of 500-kV/m with a duration of 1-μs,
and the in vitro exposure of LLC-MK2 cells to electric pulses of 200-kV/m,
500-kV/m, and 700-kV/m with a duration of 1-μs, followed by fluorescence
microscopy analysis.

II. MATERIAL AND METHODS 

A. Mathematical Modeling of Electroporation 

For the mathematical description of the electroporation at the cell
membrane, let us consider the model consisting of a cell in a suspension
medium within the parallel plates of a cuvette, as indicated in Fig.1. For
convenience, the cell is considered spherical and composed only of a
membrane and cytoplasm, which are characterized by a conductivity and
permittivity (σm, εm) and (σc, εc), respectively. The outer region, or
suspension medium, is also characterized by its conductivity and
permittivity (σo, εo).

Fig. 1. Cell model - not to scale


The dynamics of the electroporation process, i.e. the behavior of pore
generation, growth and size evolution at the cell membrane, can be
calculated using the continuum Smoluchowski theory [6], [11], with the
following governing equation for the pore density distribution function
n(r, t),

$$\frac{\partial n}{\partial t} + D \frac{\partial}{\partial r} \left[ -\frac{n}{k_B T} \frac{\partial E}{\partial r} - \frac{\partial n}{\partial r} \right] = S(r) \qquad (1)$$

where S(r) is the source, or pore formation, term; D is a pore diffusion
constant; r is the pore radius; T is the temperature; kB is the Boltzmann
constant; and E(r) is the pore energy. This expression can be simplified by
an asymptotic approximation based on [6]-[9]. This is discussed next.

1) The formation of pores: It is assumed that the pores are hydrophilic
and, thus, able to conduct current, and that they are created with an
initial radius r∗ at a rate determined by

$$\frac{dn}{dt} = \alpha\, e^{(V_m/V_{ep})^2} \left( 1 - \frac{n}{n_{eq}(V_m)} \right) \qquad (2)$$

where n(t, θ) is the pore density and neq is the equilibrium pore density
for a given transmembrane voltage Vm,

$$n_{eq}(V_m) = n_0\, e^{q (V_m/V_{ep})^2} \qquad (3)$$
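The pore creation law can be explored numerically with a simple time-stepping sketch. The code below uses the values of α, Vep, n0 and q listed in Table I; the forward Euler scheme, the constant transmembrane voltage during the pulse, the initial density n(0) = n0 and the function names are assumptions made for illustration, not the scheme used in the paper.

```python
import math

ALPHA = 1.0e9  # m^-2 s^-1, creation rate coefficient (Table I)
V_EP = 0.258   # V, characteristic voltage of electroporation (Table I)
N0 = 1.5e9     # m^-2, equilibrium pore density at Vm = 0 (Table I)
Q = 2.4606     # dimensionless constant (Table I)


def n_eq(vm):
    """Equilibrium pore density for a transmembrane voltage vm."""
    return N0 * math.exp(Q * (vm / V_EP) ** 2)


def dn_dt(n, vm):
    """Right-hand side of the pore creation law."""
    return ALPHA * math.exp((vm / V_EP) ** 2) * (1.0 - n / n_eq(vm))


def integrate_density(vm, duration, steps=1000, n_start=N0):
    """Forward Euler integration with vm held constant (assumption)."""
    dt = duration / steps
    n = n_start
    for _ in range(steps):
        n += dt * dn_dt(n, vm)
    return n


if __name__ == "__main__":
    # Pore density after a 1 microsecond pulse for a few assumed voltages.
    for vm in (0.0, 0.5, 1.0, 1.2):
        print(vm, integrate_density(vm, 1.0e-6))
```

At Vm = 0 the density stays at n0, while the exponential dependence on (Vm/Vep)^2 makes the creation rate grow very steeply once Vm approaches about 1 V.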

2) The evolution of pore radii: The rate of change of the pore radii is
given by

$$\frac{dr_j}{dt} = U(r_j, V_m, \sigma_{eff}), \quad j = 1, 2, \dots, K \qquad (4)$$

where U is the advection velocity given by

$$U(r, V_m, \sigma_{eff}) = \frac{D}{k_B T} \left[ \frac{V_m^2 F_{max}}{1 + r_h/(r + r_t)} + 4\beta \left( \frac{r_*}{r} \right)^4 \frac{1}{r} \right] + \frac{D}{k_B T} \left[ -2\pi\gamma + 2\pi\sigma_{eff}\, r \right] \qquad (5)$$

The last term represents the effective tension of the membrane, σeff, which
is a function of Ap, the combined area of all pores existing on the cell,

$$\sigma_{eff}(A_p) = 2\sigma' - \frac{2\sigma' - \sigma_0}{1 - A_p/A} \qquad (6)$$

where σ0 is the tension of the membrane without pores, σ′ is the energy per
area of the hydrocarbon-water interface, $A_p = \sum_{j=1}^{K} \pi r_j^2$,
and A is the surface area of the cell. In this work it is assumed that
changes of cell shape, area, and volume can be ignored for microsecond
pulses.
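The radius evolution can be sketched in a few lines using the constants of Table I. In the code below the effective tension follows (6) as printed, the Boltzmann constant is the SI value, and the explicit Euler update, the cell area 4πR1² and the function names are assumptions made for illustration.

```python
import math

K_B = 1.380649e-23   # J/K, Boltzmann constant
T = 310.0            # K (Table I)
D = 5.0e-14          # m^2/s, pore diffusion constant (Table I)
BETA = 1.4e-19       # J, steric repulsion energy (Table I)
GAMMA = 1.8e-11      # J/m, pore edge energy (Table I)
F_MAX = 0.7e-9       # N/V^2, maximum electric force (Table I)
R_STAR = 0.51e-9     # m, initial pore radius (Table I)
R_H = 0.97e-9        # m (Table I)
R_T = 0.31e-9        # m (Table I)
SIGMA_PRIME = 2.0e-2 # J/m^2, hydrocarbon-water interface energy (Table I)
SIGMA_0 = 1.0e-6     # J/m^2, tension without pores (Table I)
CELL_AREA = 4.0 * math.pi * (5.0e-6) ** 2  # m^2, sphere of radius R1


def sigma_eff(pore_radii, cell_area=CELL_AREA):
    """Effective membrane tension as a function of the total pore area."""
    a_p = sum(math.pi * r * r for r in pore_radii)
    return 2.0 * SIGMA_PRIME - (2.0 * SIGMA_PRIME - SIGMA_0) / (1.0 - a_p / cell_area)


def advection_velocity(r, vm, s_eff):
    """Advection velocity U(r, Vm, sigma_eff) of a pore of radius r."""
    electric = vm ** 2 * F_MAX / (1.0 + R_H / (r + R_T))
    steric = 4.0 * BETA * (R_STAR / r) ** 4 / r
    tension = -2.0 * math.pi * GAMMA + 2.0 * math.pi * s_eff * r
    return D / (K_B * T) * (electric + steric + tension)


def step_radii(radii, vm, dt, cell_area=CELL_AREA):
    """One explicit Euler step of dr_j/dt = U for every pore (assumption)."""
    s_eff = sigma_eff(radii, cell_area)
    return [r + dt * advection_velocity(r, vm, s_eff) for r in radii]


if __name__ == "__main__":
    radii = [R_STAR] * 3
    for _ in range(100):
        radii = step_radii(radii, 1.0, 1.0e-9)  # assumed Vm = 1 V, dt = 1 ns
    print(radii)
```

A pore created at r∗ has a positive velocity even at Vm = 0, because the steric term dominates at small radii; the electric term then drives further growth while the pulse is on.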

3) The voltage in the cell membrane: The voltage in the cell membrane Vm
can be calculated as the difference between Vi and Ve, i.e. the difference
between the potentials at the interfaces of the cell membrane, as indicated
in Fig.2,

$$V_m(t, \theta) = V_i(t, R_2, \theta) - V_e(t, R_1, \theta) \qquad (7)$$

The potential Vi at the inner interface (cytoplasm) and Ve at the outer
interface (outer region) of the cell membrane can be obtained from two
systems defined by Laplace's equation,

$$\nabla^2 V_i = 0 \quad \text{and} \quad \nabla^2 V_e = 0 \qquad (8)$$

Fig. 2. Cell membrane interface - not to scale

The applied electric pulse is imposed as a boundary condition on the outer
region Ve,

$$V_e(t, r, \theta) = -E\, r \cos(\theta) \quad \text{as } r \to \infty \qquad (9)$$

where r is the distance from the center of the cell and θ is the polar
angle measured with respect to the direction of the applied pulse E. In
terms of numerical calculation, it is sufficient to set r ≥ 3R1, i.e. to
extend the outer region to at least three times the cell radius. This will
be discussed in detail in the numerical calculation subsection.

The other boundary conditions can be defined for t

TABLE I
PARAMETERS FOR THE CELL NUMERICAL MODEL [9]

Parameter (unit)     Value
σc (S m^-1)          3.0 × 10^-1
σo (S m^-1)          3.0 × 10^-1
σm (S m^-1)          3.0 × 10^-7
εo (As V^-1)         6.4 × 10^-10
εc (As V^-1)         6.4 × 10^-10
εm (As V^-1)         4.4 × 10^-11
R1 (m)               5.0000 × 10^-6
R2 (m)               4.9995 × 10^-6
CM (F m^-1)          8.8 × 10^-3
n0 (m^-2)            1.5 × 10^9
α (m^-2 s^-1)        1 × 10^9
Vep (V)              0.258
r∗ (m)               0.51 × 10^-9
rm (m)               0.8 × 10^-9
rh (m)               0.97 × 10^-9
rt (m)               0.31 × 10^-9
T (K)                310
q                    2.4606
D (m^2 s^-1)         5 × 10^-14
γ (J m^-1)           1.8 × 10^-11
β (J)                1.4 × 10^-19
Fmax (N V^-2)        0.7 × 10^-9
σ′ (J m^-2)          2 × 10^-2
σ0 (J m^-2)          1 × 10^-6
Vrest (V)            −0.08

It is capable of providing rectangular pulses with an amplitude of 1,000V
and a duration in the range of 1μs to 100μs, with resting intervals of 10μs
between the pulses. The modules are discussed in detail next.

Fig. 4. Pulse generator modules. 

1) The high voltage d.c. source: The high voltage source 

module is based on [17] and uses a six capacitor doubler 

stage, as indicated in Fig.5. It is capable of converting a.c. 

voltage into d.c. voltage up to 2kV. It uses MUR1560 diodes 

and 330μF capacitors. A 1:1 transformer is used to provide 

insulation between the electrical grid and the module; in 

addition, a variable autotransformer is used to control the a.c. 

input and the d.c. output. 

Fig. 5. High voltage dc source.

2) The control module: The control circuit uses a PIC18F4550 microcontroller that operates at 12 MIPS. The control driver is based on a push-pull amplifier composed of a pair of MOSFETs, models ZVN2106A and ZVP2106A, both capable of working with voltages up to 60 V and currents up to 500 mA, as indicated in Fig. 6. The control driver is responsible for adapting the control signal produced by the microcontroller to the ratings required to charge the capacitances of the switching circuit quickly, which is achieved by increasing the output voltage and the maximum current.

Fig. 6. Driver circuit. 

3) High voltage storage and switching: The high voltage switching is based on an IGBT (IRGPS60B120KD) that operates with voltages up to 1.2 kV and current pulses up to 240 A. It has a 45 ns rise time, a 58 ns fall time and a 400 ns turn-off delay, which allows pulses with amplitude and width in the range required in this work. The IGBT is connected in series with a 73 μF capacitor and the load, as indicated in Fig. 7. The capacitor charges through a 1 kΩ resistor until it reaches the same voltage as the high voltage source. When the IGBT is activated, it inverts the voltage on the capacitor, and a negative voltage equal to the one that charged the capacitor appears on the load.
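As a rough illustration of the charging stage described above, the following Python sketch evaluates the time constant of the 1 kΩ / 73 μF charging path. It assumes an ideal first-order RC circuit and neglects the source impedance and any load current; the function names are hypothetical and not part of the original work.

```python
import math

R_CHARGE = 1e3     # charging resistor in ohms (value from the text)
C_STORE = 73e-6    # storage capacitor in farads (value from the text)


def charge_fraction(t, r=R_CHARGE, c=C_STORE):
    """Fraction of the source voltage reached after charging for t seconds.

    Assumption: ideal RC charging from a stiff d.c. source.
    """
    return 1.0 - math.exp(-t / (r * c))


def time_to_fraction(fraction, r=R_CHARGE, c=C_STORE):
    """Time in seconds needed to reach the given fraction (0 < fraction < 1)."""
    return -r * c * math.log(1.0 - fraction)


tau = R_CHARGE * C_STORE  # time constant in seconds
print(f"time constant: {tau * 1e3:.1f} ms")
print(f"time to 99 % of the source voltage: {time_to_fraction(0.99) * 1e3:.0f} ms")
```

Under these assumptions the time constant is 73 ms, so the capacitor needs a few hundred milliseconds to charge almost completely.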

Fig. 7. Switching circuit.

4) The pulse generator: The pulse generator built is shown 

in Fig.8. For safety reasons, the high voltage source and the 

high voltage storage and switching circuit are inside an acrylic 

box; the control module is outside the box. 

Fig. 8. Pulse generator. 

C. In vitro experiment 

1) Cell culture procedures: LLC-MK2 cells (monkey kidney epithelial cell line) were maintained in DMEM (Dulbecco's modified Eagle's medium, Invitrogen) with 10% fetal bovine serum (FBS, Invitrogen), 1% penicillin-streptomycin (Invitrogen) and 1% glutamine (Sigma-Aldrich) at 37 °C under a 5% CO2 atmosphere. The cell culture was kept at an optimal density through weekly passages. Briefly, LLC-MK2 cells were grown until the monolayer was 70-90% confluent. To perform the subculture, the cell culture medium was removed and the cells were rinsed twice with PBS -/-. Trypsin was used to detach the adherent cells (0.5 ml per 25 cm² of surface area). The cells were then resuspended in 4.5 ml of fresh serum-containing medium for trypsin inactivation. Cell viability was assessed directly by Trypan Blue staining. Cultures were split 1:10 and placed in a new flask with DMEM 10%.

2) Exposure to electroporation: LLC-MK2 cells were collected from the culture media and suspended in HBS (Hepes Buffered Saline) at a concentration of 2.6 × 10^6 cells/ml in rectangular electroporation cuvettes with 1-mm electrode separation. All manipulations were done under sterile conditions in a Veco vertical laminar flow cabinet. Electroporation was monitored with propidium iodide (molecule size 1.5 nm × 2 nm; 25 μg/ml, Sigma-Aldrich), a fluorochrome that is excluded from cells with an intact membrane.

3) Fluorescence microscopy: Aliquots of control and pulsed cells were placed onto glass slides and observed in an Axioplan 2 Zeiss fluorescence microscope (UV emission 630 nm).

III. RESULTS AND DISCUSSION 

A. Numerical simulations 

The simulations considered a 5-μm radius cell subjected to a 500 kV/m rectangular electric pulse for 1 μs. The results for the total number of pores, the number of pores at the polarized poles and the maximum pore radii are shown in Figs. 9-11. It can be seen from Figs. 9-11 that pore nucleation starts at approximately 0.8 μs, when Vm is approximately 1.25 V, and reaches ∼3,500 pores at the cell membrane, most of them, ∼950, located at the poles of the cell aligned with the applied electric pulse (θ = 0° and θ = 180°), with radii varying from 0.5 nm to 13 nm. After the initial stage, the number of pores increases until around 1 μs, when the pulse ends. In the final stage, the radii of the pores decrease very fast, but the number of pores stays stable for a longer period. This is in agreement with the literature: large pores tend to decay faster, but the complete resealing of all pores takes a longer period.

Fig. 9. Total number of pores. 

Fig. 10. Number of pores in the 1st sector (θ = 0°) and in the 180th sector (θ = 180°).

B. Experiments 

A series of in vitro experiments was carried out with the LLC-MK2 cells following the methodology described in the previous section. After exposure to a 1-μs pulse, the LLC-MK2 cells exhibited intense red fluorescence, mostly in the cell nuclei, where propidium iodide binds to double-stranded DNA, as illustrated in Fig. 13. This is strong evidence that pores with radii of at least 2 nm (the size of the propidium iodide molecule) were created at the cell membrane. The numerical calculation predicts the creation of most pores at the poles of the cell (θ = 0° and θ = 180°) with radii varying from 0.5 nm to 13 nm; the fluorescence microscopy can only partially confirm this. Further tests with other microscopy techniques are required to confirm the region in the cell membrane where the pores are created, their sizes and how long they last.

Fig. 11. Maximum radii evolution.

Fig. 12. Pulse of 500V × 1μs measured at the cuvette. 

Fig. 13. Propidium iodide influx into LLC-MK2 cell exposed to pulses of 

1-μs, 200V (a), 500V (b) and 700V (c). 
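The pulse voltages quoted in the figure captions and the field strengths quoted in the text can be related through the 1-mm electrode separation of the cuvette. The short Python sketch below shows this conversion; it assumes a uniform field between parallel plate electrodes and neglects fringing and electrode polarization, and the function name is hypothetical.

```python
ELECTRODE_GAP_M = 1e-3  # 1-mm electrode separation of the cuvette (from the text)


def field_kv_per_m(voltage_v, gap_m=ELECTRODE_GAP_M):
    """Nominal field strength E = V / d in kV/m.

    Assumption: uniform field between parallel plate electrodes.
    """
    return voltage_v / gap_m / 1e3


for voltage in (200.0, 500.0, 700.0):
    print(f"{voltage:.0f} V across 1 mm -> {field_kv_per_m(voltage):.0f} kV/m")
```

With this assumption, the 500 V pulse measured at the cuvette corresponds to the 500 kV/m field used in the simulations.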


IV. CONCLUSION

This paper considered the study of the electroporation phenomenon in a single cell, using numerical simulations and an in vitro experiment. The numerical simulations considered a 5-μm radius cell subjected to a 500 kV/m rectangular electric pulse for 1 μs, which was addressed using an asymptotic approximation based on the Smoluchowski theory and the finite difference method. The results indicate the formation of ∼3,500 pores at the cell membrane, most of them, ∼950, located at the poles of the cell aligned with the applied electric pulse, with radii varying from 0.5 nm to 13 nm. The in vitro experiment considered the exposure of LLC-MK2 cells to electric pulses of 200 kV/m, 500 kV/m and 700 kV/m with a duration of 1 μs. Images from fluorescence microscopy confirm the electroporation of the LLC-MK2 cells. The methodology employed is adequate to investigate the electroporation phenomenon in simple cells exposed to electric pulses in the kV/m and μs ranges.

V. ACKNOWLEDGMENT 

This work was supported by CNPq, Brazil, under 

Grants 306910/2006-3, 482185/2010-4, 507810/2010-4, 

504978/2010-1; by FAPEMIG, Brazil, under Grants Pronex: 

TEC 01075/09 and TEC-PPM-489/10; by CAPES, Brazil and 

DFAIT, Canada. 

REFERENCES 

[1] R. Stampfli, “Reversible electrical breakdown of the excitable membrane 

of a Ranvier node,” An. Acad. Brasil. Ciens., vol.30, pp.57-63, 1958. 

[2] T.Y. Tsong, “Electroporation of cell membrane,” Biophy. J., vol.60, 

pp.297-306, 1991. 

[3] J.C. Weaver and Y.A. Chizmadzhev, “Theory of electroporation: a review,” 

Bioelectroch. Bioenergetics, vol.41, pp.135-160, 1996. 

[4] T. Kotnik, P. Kramar, G. Pucihar, D. Miklavcic, M. Tarek, “Cell 

membrane electroporation. Part 1: The phenomenon,” IEEE Elec. Ins. 

Magazine, vol.28, pp.14-23, 2012. 

[5] R.P. Joshi and K.H. Schoenbach, “Bioelectric effects of intense ultrashort 

pulses,” Critic. Rev. in Biom. Eng., vol.38, pp.255-304, 2010. 

[6] J.C. Neu and W. Krassowska, “Asymptotic model of electroporation,” Phys. Rev. E, vol.59, pp.3471-3482, 1999.

[7] K.A. DeBruin and W. Krassowska, “Modeling electroporation in a single 

cell. I: Effects of field strength and rest potential,” Biophy. J., vol.77, 

pp.1213-1224, 1999. 

[8] K.C. Smith, J.C. Neu, W. Krassowska, “Model of creation and evolution of stable electropores for DNA delivery,” Biophy. J., vol.86, pp.2813-2826, 2004.

[9] W. Krassowska and P.D. Filev, “Modeling electroporation in a single cell,” 

Biophy. J., vol.92, pp.404-417, 2007. 

[10] R.P. Joshi and K.H. Schoenbach, “Electroporation dynamics in biological 

cells subjected to ultrafast electrical pulses: a numerical simulation 

study,” Phys. Review E, vol.62, pp.1025-1033, 2000. 

[11] R.P. Joshi, Q. Hu, K.H. Schoenbach, “Dynamical modeling of cellular 

response to short duration, high intensity electric fields,” IEEE Trans. on 

Dielec. Elec. Ins., vol.10, pp.778-787, 2003. 

[12] K.H. Schoenbach, R.P. Joshi, J.F. Kolb, N. Chen, M. Stacey, P.F. Blackmore, E.S. Buescher, S.J. Beebe, “Ultrashort electrical pulses open a new gateway into biological cells,” Proceedings of IEEE, vol.92, pp.1122-1137, 2004.

[13] Q. Hu, R.P. Joshi, K.H. Schoenbach, “Simulations of nanopore formation 

and phosphatidylserine externalization in lipid membrane subjected to a 

high intensity, ultrashort electric pulse,” Phys. Review E, vol.72, 031902, 

2005. 

[14] J. Mankowski and M. Kristiansen, “A review of short pulse generator 

technology,” IEEE Trans. on Plasma Science, vol.28, pp.102-108, 2000. 

[15] A. Chaney and R. Sundararajan, “Simple mosfet-based high-voltage 

nanosecond pulse circuit,” IEEE Trans. on Plasma Science, vol.32, 

pp.1919-1924, 2004. 

[16] J. R. Grenier and M. Kazerani, “Mosfet-based pulse power supply for 

bacterial transformation,” IEEE Trans. on Industry Application, vol.44, 

pp.25-31, 2008. 

[17] E. Kuffel, W.S. Zaengl, J. Kuffel. High Voltage Engineering Fundamentals. 

Second Edition, Butterworth Heinemann, Oxford, UK, 2000.


Anisotropic Model for the Numerical 

Computation of Magnetostriction in 

Grain-Oriented Electrical Steel Sheets 

M. Kaltenbacher∗, A. Volk†, and M. Ertl‡

∗Institute of Mechanics and Mechatronics, Vienna University of Technology, Austria 

† Department of Sensor Technology, University of Erlangen-Nuremberg, Germany 

‡ Siemens Energy Sector, Nuremberg, Germany 

E-mail: manfred.kaltenbacher@tuwien.ac.at 

Abstract—We present a recently developed physical model for magnetostriction in transformer cores and its efficient 

numerical computation by applying the Finite Element (FE) method. Thereby, we fully take the anisotropic behavior of the 

material into account, both in the computation of the nonlinear electromagnetic field as well as the induced magnetostrictive 

strains. Numerical computations demonstrate the importance of modeling the anisotropy of grain oriented electrical steel 

sheets as used in electric transformers. Both the magnetic field along the joint regions, and furthermore the mechanical 

vibrations especially in thickness direction differ strongly as compared to computations with an isotropic material model. 

Index Terms—magnetostriction, finite element method, anisotropic material behavior, nonlinearity 


I. INTRODUCTION

Magnetostrictive materials are widely used for actuator and sensor applications. Often, however, the magnetostrictive behavior of these alloys is an undesirable effect, e.g. in electric machines and transformers, where it is one of the main sources of noise. Unfortunately, these materials exhibit nonlinear behavior in their magnetic properties as well as in their mechanical characteristics, leading to the well-known magnetic hysteresis loop and the magnetostrictive hysteresis loop (the so-called butterfly curve), respectively (see, e.g., [1], [2], [3]). An important aspect, especially for grain-oriented electrical steel as used in transformers, is the anisotropic material behavior, both of the magnetic properties and of the induced mechanical strains [4].

The modeling of magnetostrictive effects is a topic of intensive research. Among the large number of publications one can find three main approaches. The first one, which is widely used, is based on introducing a magnetostrictive strain tensor whose entries depend on the magnetic induction (see, e.g., [5], [1]). These additional strains result in mechanical forces, modeled as a right-hand side term in the partial differential equation (PDE) for mechanics. In a second approach, a free energy expressed as a tensor function of the mechanical strain and the magnetic induction is used (see, e.g., [6], [7]). Thereby, a fully coupled constitutive relation between mechanical and magnetic quantities is achieved. The last approach is based on a thermodynamically consistent model, in which the mechanical strain and the magnetic induction are decomposed into a reversible and an irreversible part [2], [3]. Furthermore, the full constitutive model is based on a free energy function. Whereas in [2] the irreversible part is modeled by a switching criterion using inner variables, [3] uses hysteresis operators. Common to all models is the current restriction to isotropic and/or uniaxial behavior.

Our goal is the precise investigation of the magnetic field and the resulting mechanical vibrations caused by magnetostriction along the joint regions of electric transformers. Therefore, we cannot apply any homogenization technique and must fully resolve each individual steel sheet. This is clearly not possible for a whole transformer core, and so we restrict our investigation to a few steel sheets. To reduce the complexity, we choose an ansatz in which we neglect the reaction of the mechanical stresses and strains on the magnetic properties and therefore decouple the computation of the magnetic and the mechanical field. With the help of an Epstein frame and an SST (Single Sheet Tester), we measure the magnetic as well as the mechanical hysteresis curves of the grain-oriented electrical steel sheets for different orientations (with respect to the rolling direction). From these curves we then extract for each orientation the corresponding commutation curve, so that the hysteretic behavior is simplified to a nonlinear one. This approach is then applied to a stack of six electrical steel sheets with a 90° joint region, excited by two current-loaded coils. We compare this anisotropic model to an isotropic one, in which the nonlinear magnetic and mechanical material parameters are taken from the rolling direction only.

The rest of the paper is organized as follows. In Sec. II we describe our physical model and its incorporation into the magnetic and mechanical PDEs as well as their Finite Element (FE) formulation. The measurement setups, which provide the nonlinear curves, are discussed in Sec. III. In Sec. IV the numerical results are presented, demonstrating the importance of taking into account the anisotropy of grain-oriented electrical steel sheets as used in transformers. Finally, Sec. V summarizes our achievements.

II. PHYSICAL MODELING AND FE DISCRETIZATION 

Magnetostrictive materials are characterized by the 

magnetic hysteresis between the magnetic induction B 

and magnetic field intensity H as well as the mechanical 

hysteresis between the mechanical strain S and magnetic 

induction B (see Fig. 1). 

Fig. 1. Magnetic and mechanical hysteresis (butterfly curve). 

According to a thermodynamically consistent model, we decompose the physical quantities magnetic induction and mechanical strain into a reversible and an irreversible part (indicated by the superscripts r and i, respectively; see footnote 1)

$$\mathbf{S} = \mathbf{S}^r + \mathbf{S}^i , \qquad \mathbf{B} = \mathbf{B}^r + \mathbf{B}^i . \qquad (1)$$

To allow for the history of the driving magnetic field 

intensity, the irreversible magnetic induction Bi is set to 

be equal to the magnetization M, which is modeled, e.g., 

by a Preisach hysteresis operator [3] 

$$\mathbf{B}^i = \mathbf{M} = \mathcal{H}[H]\,\mathbf{e}_M . \qquad (2)$$

The irreversible strain can be, e.g., expressed by the 

following polynomial ansatz [3] 

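$$[S^i] = \frac{3}{2}\left(\beta_1\,\mathcal{H}[H] + \beta_2\,(\mathcal{H}[H])^2 + \cdots + \beta_n\,(\mathcal{H}[H])^n\right)\left(\mathbf{e}_M \otimes \mathbf{e}_M^t - \frac{1}{3}[I]\right) , \qquad (3)$$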

while the parameters β1, ···,βn need to be fitted to 

measurement data and [I] denotes the identity tensor. 

Now, magnetostriction is a property of ferromagnetic 

materials and can be described as a coupling between the 

mechanical and the magnetic field. This relation is described 

by the well-known magnetostrictive constitutive 

equations modeling the linear coupling of the magnetic 

and the mechanical deformation [2] 

$$\boldsymbol{\sigma} = [c^H]\,\mathbf{S}^r - [e]^t\,\mathbf{H} \qquad (4)$$

$$\mathbf{B}^r = [e]\,\mathbf{S}^r + [\mu^S]\,\mathbf{H} . \qquad (5)$$

In (4), (5) σ denotes the Cauchy stress tensor in Voigt notation, 

[cH ] the tensor of mechanical moduli (at constant 

magnetic field intensity), [e] the piezomagnetic coupling 

tensor and [μS ] the tensor of magnetic permeability (at 

constant mechanical strain). 

Footnote 1: With [S] we denote the tensor of mechanical strain and with S the algebraic vector containing the three normal and three shear strains according to Voigt notation.

By using these constitutive relations, we have presented in [3] a formulation based on the magnetic scalar potential, and in [8] we have extended it to the magnetic vector potential in order to also take eddy current effects into account. However, both models are currently restricted to scalar hysteresis operators and do not take into account the anisotropic behavior, which is a crucial point for grain-oriented electrical steel used in transformer cores [4]. Furthermore, both models are quite expensive in terms of computational time. Therefore, we have developed a dedicated physical model for grain-oriented electrical steel. In doing so, we first assume that the entries of the piezomagnetic coupling tensor [e] are small, so that this coupling can be neglected in (4), (5). Next, the anisotropic and nonlinear magnetic behavior of the steel sheets is modeled by the vector relation between the magnetic induction B and the field intensity H

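$$\mathbf{B} = \mathbf{B}(\mathbf{H}) = B_\varphi(H)\,\mathbf{e}_B \,; \qquad \mathbf{e}_B = \frac{\mathbf{B}}{|\mathbf{B}|} . \qquad (6)$$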

Here, we compute the unit vector eB and evaluate the 

magnetic commutation curve Bϕ for which the orientation 

fits best with eB. Therefore, the defining partial 

differential equation (PDE) for the magnetic field reads 

as 

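$$\gamma\,\frac{\partial \mathbf{A}}{\partial t} - \nabla \times \nu(B_\varphi)\,\nabla \times \mathbf{A} = \mathbf{J}_i \qquad (7)$$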

with A the magnetic vector potential, Ji the impressed 

current density, ν the magnetic reluctivity depending on 

Bϕ (see (6)) and γ the electric conductivity. 
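The curve selection described for (6) can be sketched in Python as follows. This is only an illustration under stated assumptions: the curves are assumed to be tabulated at the seven measured angles (0° to 90° in steps of 15°), "fits best" is interpreted as the nearest measured angle, and the curve is evaluated by linear interpolation. All function names and the data layout (`curves` as a dictionary from angle to sample arrays) are hypothetical.

```python
import numpy as np

# Angles (degrees from the rolling direction) at which curves were measured.
MEASURED_ANGLES_DEG = np.arange(0.0, 91.0, 15.0)


def angle_to_rolling_direction(b_vec, rolling_dir):
    """Angle in degrees (0..90) between e_B and the rolling direction."""
    e_b = np.asarray(b_vec, dtype=float)
    e_b = e_b / np.linalg.norm(e_b)   # unit vector e_B = B / |B| (B must be nonzero)
    e_r = np.asarray(rolling_dir, dtype=float)
    e_r = e_r / np.linalg.norm(e_r)
    # The sign of the direction is irrelevant, so the angle is folded into [0, 90].
    cos_phi = min(abs(float(np.dot(e_b, e_r))), 1.0)
    return float(np.degrees(np.arccos(cos_phi)))


def best_fitting_angle(phi_deg):
    """Measured angle closest to phi_deg (assumed meaning of 'fits best')."""
    idx = int(np.argmin(np.abs(MEASURED_ANGLES_DEG - phi_deg)))
    return float(MEASURED_ANGLES_DEG[idx])


def evaluate_curve(curves, b_vec, rolling_dir, h_mag):
    """Evaluate B_phi(|H|) on the tabulated curve that fits e_B best.

    curves: dict mapping a measured angle to (h_samples, b_samples),
            with h_samples in increasing order.
    """
    phi = best_fitting_angle(angle_to_rolling_direction(b_vec, rolling_dir))
    h_samples, b_samples = curves[phi]
    return float(np.interp(h_mag, h_samples, b_samples))
```

A smoother alternative would be to interpolate between the two neighbouring measured angles; the text does not state which variant was used.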

The PDE for mechanics is given by

$$\rho\,\frac{\partial^2 \mathbf{u}}{\partial t^2} - \mathcal{B}^t \boldsymbol{\sigma} = 0 \qquad (8)$$

with u the mechanical displacement, ρ the density, and $\mathcal{B}$ = ∇s the differential operator. As in (3), we assume conservation of volume for the irreversible strain. However, instead of the hysteretic behavior we now model a nonlinear, anisotropic behavior, described by the magnetostrictively induced strain tensor $[S^m]$, which is computed as follows

$$[S^m] = \frac{3}{2}\left(\mathbf{e}_B \otimes \mathbf{e}_B^t - \frac{1}{3}[I]\right) S^m_\varphi(B) . \qquad (9)$$

Here, we compute the direction of B and evaluate the magnetostrictive commutation curve $S^m_\varphi(B)$ whose orientation fits best with e_B. Now, we can express the reversible mechanical strain $\mathbf{S}^r$ as the difference between the total strain $\mathbf{S} = \mathcal{B}\mathbf{u}$ and the irreversible (magnetostrictive) strain $\mathbf{S}^i = \mathbf{S}^m$ via

$$\mathbf{S}^r = \mathcal{B}\mathbf{u} - \mathbf{S}^m . \qquad (10)$$

Inserting this relation into (4) and neglecting [e] turns (8) into

$$\rho\,\frac{\partial^2 \mathbf{u}}{\partial t^2} - \mathcal{B}^t [c^H]\,\mathcal{B}\mathbf{u} = -\mathcal{B}^t [c^H]\,\mathbf{S}^m . \qquad (11)$$
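To make the structure of (9) concrete, the following Python sketch builds the magnetostrictive strain tensor for a given induction vector and a given scalar value of the commutation curve. It is a minimal illustration, not the authors' implementation; the function name and the numerical values in the usage lines are hypothetical.

```python
import numpy as np


def magnetostrictive_strain_tensor(b_vec, s_phi):
    """Strain tensor of Eq. (9) for the scalar curve value s_phi = S^m_phi(B).

    b_vec: magnetic induction vector (must be nonzero).
    """
    e_b = np.asarray(b_vec, dtype=float)
    e_b = e_b / np.linalg.norm(e_b)                  # unit vector e_B
    deviator = np.outer(e_b, e_b) - np.eye(3) / 3.0  # e_B (x) e_B^t - I/3
    return 1.5 * deviator * s_phi


# Hypothetical values: induction along x, curve value of 2 micrometres per metre.
strain = magnetostrictive_strain_tensor([1.7, 0.0, 0.0], 2.0e-6)
print(np.diag(strain))   # normal strains along x, y, z
print(np.trace(strain))  # vanishes up to rounding (conservation of volume)
```

The factor 3/2 ensures that the strain along the direction of B equals the curve value itself, while the two transverse directions each contract by half of it, so the trace is zero.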

The Finite Element (FE) formulation of (7) and (11) is straightforward. For (7) we use edge finite elements and solve the arising algebraic system of equations by an efficient Newton scheme utilizing a two-level solver [9]. For (11) we apply nodal finite elements (for details, see e.g. [10]).

Summarizing, the developed magnetostrictive model 

has the following features: 

• Decoupling of magnetic and mechanical PDEs; so 

both PDEs can be solved separately with optimal 

conditions. 

• Anisotropy and eddy currents are taken into account 

• No hysteresis considered; instead it uses commutation 

curves computed from measured hysteresis 

curves. 

• Change of magnetic properties due to the mechanical 

field within a working point is neglected 

(working point can be determined by pre-stressing 

of measured samples). 

III. MEASUREMENT SETUPS 

First of all, to obtain reliable measurement data for the magnetic behavior, we have constructed an Epstein frame according to IEC 60404-2 (see Fig. 2). The 25 cm Epstein apparatus consists of 4 coils with primary windings, secondary windings, a compensation coil and the material sample as core. The sheets are stacked in strips. In this way the measurement setup represents a transducer whose characteristics are specified. The primary (outer) windings are used to magnetize the material, and the secondary (inner) windings are needed to determine the magnetic flux density from the induced voltage. We have performed measurements for steel sheets which have been cut out at different angles with respect to the rolling direction. Thereby, for each stack of steel sheets, we have measured the outer and all inner hysteresis loops, as demonstrated in Fig. 3. From all the hysteresis loops, we compute for each angle a commutation curve (see Fig. 4), which we then use in our numerical computation of the magnetic field.

Fig. 2. 25 cm Epstein frame for measuring the magnetic properties of grain-oriented electrical steel sheets (labeled parts: steel sheets overlapping at the corners, coil to compensate the flux in air, excitation and measurement coils).
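The text does not state how the commutation curve is computed from the measured loops. One common definition takes the locus of the reversal points (tips) of the nested symmetric hysteresis loops; the Python sketch below follows that definition as an assumption. The function name and the data layout are hypothetical.

```python
import numpy as np


def commutation_curve(loops):
    """Commutation curve as the locus of loop tips (assumed definition).

    loops: iterable of (h, b) sample arrays, one pair per measured
           symmetric hysteresis loop (outer loop and inner loops).
    Returns the tip values (h_tip, b_tip), sorted by increasing H.
    """
    tips = []
    for h, b in loops:
        h = np.asarray(h, dtype=float)
        b = np.asarray(b, dtype=float)
        i_tip = int(np.argmax(h))        # reversal point at maximum H
        tips.append((h[i_tip], b[i_tip]))
    tips.sort(key=lambda tip: tip[0])
    h_tip = np.array([tip[0] for tip in tips])
    b_tip = np.array([tip[1] for tip in tips])
    return h_tip, b_tip
```

Applied to the loops of one cutting angle, the returned points form a single-valued nonlinear BH curve of the kind used in the numerical computation. Noisy measurements would typically call for averaging the positive and negative tips, which is omitted here.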

To measure the mechanical hysteresis of the electrical 

steel sheets a second measurement setup was constructed 

on the basis of a Single Sheet Tester (SST) as displayed 

in Fig. 5. This extended setup also captures 


Fig. 3. Magnetic hysteresis curves for grain-oriented electrical steel sheets being cut out at different angles with respect to the rolling direction (0° corresponds to the rolling direction). Axes: B (T) versus H (kA/m); curves for angles 0°, 15°, 30°, 45°, 60°, 75° and 90°.

Fig. 4. Nonlinear BH curves for different angles (0° corresponds to the rolling direction). Axes: B (T) versus H (kA/m); curves for angles 0°, 15°, 30°, 45°, 60°, 75° and 90°.

the magnetic induction as well as the magnetic field intensity. However, we did not use the SST to determine the magnetic hysteresis, since it has to be calibrated against a certified setup to obtain reliable measurement results. To capture the mechanical hysteresis, the SST was extended by a lifting mechanism that unloads the sample sheet to ensure its stress-free vibration. The mechanical vibration due to magnetic excitation of the SST is measured by

Fig. 5. Single sheet tester as used to obtain the mechanical hysteresis (principle and manufactured setup; labeled parts: ferrite core, excitation and measurement coil, steel sheet (material under test), laser vibrometer).

a laser vibrometer, which compared to strain gauges provides high accuracy without electromagnetic cross-sensitivity and is contact-free. The measurement of the mechanical strain as a function of the magnetic field results in the magnetostrictive hysteresis loop (the so-called butterfly curve). Additionally, the extended SST permits pre-stressing of the steel sheets in order to capture the reaction on the magnetic properties, which corresponds to a working point that is used in the simulation. To consider anisotropy, again a series of measurements is performed with different electrical steel sheets which have been cut out at varying angles with respect to the grain orientation of the steel. As for the magnetic hysteresis, we also convert the mechanical hysteresis into a single commutation curve, which leads to angle-dependent nonlinear magnetostriction curves, as displayed in Fig. 6.

Fig. 6. Nonlinear SB curves for different angles (0° corresponds to the rolling direction). Axes: S (μm/m) versus B (T); curves for angles 0°, 15°, 30°, 45°, 60°, 75° and 90°.


For the numerical investigation, we choose a setup of six stacked electrical steel sheets with a 90 degree joint and an excitation coil along each yoke, as displayed in Fig. 7. We model just a quarter of the setup and apply appropriate boundary conditions at the symmetry planes. Our main goal is to study the difference between an isotropic and an anisotropic magnetostrictive computation.

Fig. 7. Computational model: quarter symmetry is considered (labeled parts: steel sheets, excitation coils; inset zoomed and scaled in thickness direction).

For the isotropic computation we choose the measured material curves along the rolling direction (angle of zero degrees), whereas for the anisotropic computation we use all measured material curves (see Figs. 4 and 6).

In a first step, we compute the magnetic field and compare the flux lines at the joints. Figure 8 displays the flux lines for the isotropic case and Fig. 9 for the anisotropic case at the time step of maximal magnetic induction (about 1.7 T). We display the flux lines just for the two last layers and zoom into the joint region. Comparing the results, one can clearly see the difference.

Fig. 8. Magnetic flux lines for the two upper steel sheets in case of isotropic computation. For better visualization we have scaled the thickness direction by a factor of ten.

Fig. 9. Magnetic flux lines for the two upper steel sheets in case of anisotropic computation. For better visualization we have scaled the thickness direction by a factor of ten.

For the isotropic case, the amplitude of the magnetic induction drops immediately to a low value at the beginning of the joint, due to the increased effective cross section when the flux changes direction in the rectangular joint region. Since the magnetic material properties are homogeneous and independent of direction, the change of the magnetic flux direction itself is continuous across the joint region. The transition of the magnetic flux between the two vertically stacked steel sheets is mainly limited to the regions where the flux enters and leaves the joint. Accordingly, the magnetic flux density reaches its full value only at the end of the joint, when entering the opposite yoke.

In the anisotropic case, the guiding effect of the preferred magnetic direction along the grain orientation of the electrical sheet preserves the amplitude and direction of the magnetic flux for some distance into the joint region. In the area of the central diagonal of the joint region (at 45°), the reduced magnetic permeability perpendicular to the grain orientation forces the magnetic flux into a vertical transition into neighbouring steel sheets. This behavior of the magnetic field has a strong impact on the mechanical vibrations.

Fig. 10. Mechanical displacement at an observation point along the yoke (panels: x-, y- and z-displacement in nm versus time in ms; isotropic and anisotropic case).

In a second step we use the computed magnetic induction and calculate the mechanical deformation caused by the additional magnetostrictive strain. In Figs. 10 and 11 we display all three components of the mechanical displacement over time at two different observation points. In general, we observe that the displacements in the in-plane directions (x- and y-displacement) show almost no difference. However, the displacement in thickness direction (z-displacement) is quite different, both in amplitude and in frequency content. Especially at the joint region, the amplitude of the mechanical vibration is larger by a factor of about 1000 in the anisotropic case than in the isotropic case. Furthermore, the computation for the isotropic material model exhibits mainly the 100 Hz component (the current excitation is at 50 Hz). In the anisotropic case higher harmonics are predominant. The related frequency spectrum in vibration and noise is typical of what can be measured at real transformers.
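The 100 Hz component at 50 Hz excitation follows from the fact that magnetostrictive strain does not depend on the sign of the induction. The Python sketch below illustrates this with a purely quadratic strain law, which is an assumption made only for this example (the measured SB curves are not claimed to be quadratic); the coefficient and the sampling parameters are hypothetical.

```python
import numpy as np

F_EXCITATION = 50.0   # excitation frequency in Hz (from the text)
FS = 10_000.0         # sampling rate in Hz (hypothetical)
N_SAMPLES = 2000      # 0.2 s, i.e. ten excitation periods

t = np.arange(N_SAMPLES) / FS
b = 1.7 * np.sin(2.0 * np.pi * F_EXCITATION * t)  # induction in T, peak about 1.7 T
strain = 1.0e-6 * b**2                            # assumed even (quadratic) strain law


def dominant_frequency(signal, fs):
    """Frequency in Hz of the largest spectral line after removing the mean."""
    spectrum = np.abs(np.fft.rfft(signal - np.mean(signal)))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    return float(freqs[int(np.argmax(spectrum))])


print(dominant_frequency(b, FS))       # excitation frequency
print(dominant_frequency(strain, FS))  # twice the excitation frequency
```

A strain law that is even in B but not purely quadratic, or an induction waveform distorted by saturation, adds further even multiples of the excitation frequency.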

V. CONCLUSION AND OUTLOOK 

We have presented a magnetostrictive constitutive model which fully takes into account the anisotropy of grain-oriented electrical steel sheets as used in electrical transformers. The model is simplified in the sense that the magnetic as well as the mechanical hysteretic behavior is reduced to a nonlinear one by computing commutation curves from the corresponding hysteresis measurements. Furthermore, we neglect the impact of the mechanical field on the magnetic properties within a working point, which can be determined by pre-stressing the measured sample sheets. However, the model needs measurements provided by an Epstein frame and an SST (Single Sheet Tester). The computations show strong differences both in the magnetic field and in the mechanical vibrations when comparing this anisotropic model to an isotropic one, which uses only the curves measured in the rolling direction of the steel sheets.

Fig. 11. Mechanical displacement at an observation point at the joint region (panels: x-, y- and z-displacement in nm versus time in ms; isotropic and anisotropic case; annotation in the z-displacement panel: scaled by 1000).

Currently we are working on an experimental validation setup, where we can study different joint techniques, especially step-lap joints.

REFERENCES 

[1] L. Vandevelde, J. A. Melkebeek. Modeling of Magnetoelastic 

Material. IEEE Trans. on Magnetics, 38(2), 2002. 

[2] K. Linnemann, S. Klinkel, W. Wagner. A constitutive model for magnetostrictive and piezoelectric materials. International Journal of Solids and Structures, 46, 2009.

[3] M. Kaltenbacher, M. Meiler, M. Ertl. Physical modeling and 

numerical computation of magnetostriction. Compel, 28(4), 2009. 

[4] B. Weiser, H. Pfützner, J. Anger. Relevance of Magnetostriction and Forces for the Generation of Audible Noise of Transformer Cores. IEEE Trans. on Magnetics, 36(5), 2000.

[5] K. Delaere, W. Heylen, K. Hameyer, R. Belmans. Local Magnetostriction 

Forces for Finite Element Analysis. IEEE Trans. on 

Magnetics, 36(5), 2000. 

[6] A. Dorfmann and R. W. Ogden. Magneto-elastic modeling of 

elastomers. Eur. J. Mechanics and Solids, 22, 2003. 

[7] K. Fonteyn, A. Belahcen, R. Kouhia, P. Rasilo, A. Arkkio. FEM 

for Directly Coupled Magneto-Mechanical Phenomena in Electrical 

Machines. IEEE Trans. on Magnetics, 46(8), 2010. 

[8] A. Volk, M. Kaltenbacher, A. Hauck, M. Ertl, R. Lerch. Finite Element 

Scheme based on Magnetic Vector Potential and Mechanical 

Displacement for Modeling Magnetostriction. Proceedings of the 

8th International Conference on Computation in Electromagnetics 

CEM, 2011. 

[9] A. Hauck, M. Ertl, J. Schöberl, M. Kaltenbacher. Accurate Simulation of Transformer Step-Lap Joints using Anisotropic Higher Order FEM. 15th IGTE Symposium, Graz, Austria, 2012.

[10] M. Kaltenbacher. Numerical Simulation of Mechatronic Sensors 

and Actuators. Springer, 2nd edition, 2007.


Analytic Approximation Solution for the 

Schwarz-Christoffel Parameter Problem 

Norbert Eidenberger∗ and Bernhard G. Zagar∗

∗Institute for Measurement Technology, Altenberger Strasse 69, A-4040 Linz, Austria

E-mail: norbert.eidenberger@jku.at 

Abstract—We present a novel analytic approximation method for the Schwarz-Christoffel parameter problem based on 

linearization. The modeling requirements for successful linearization are discussed. The linearization introduces a mapping 

error which can be virtually eliminated by applying an optimization method. Thus, the proposed method yields conformal 

mapping functions which can provide solutions for inverse problems and are suited for sensitivity analyses. 

Index Terms—Schwarz-Christoffel parameter problem, Schwarz-Christoffel transform, conformal mapping, potential 

problem. 

I. INTRODUCTION


Conformal mapping methods provide useful tools for 

the analysis of many physical phenomena. In particular, 

these methods can be utilized to solve two dimensional 

potential problems which appear, e.g., in electromagnetics, 

fluid dynamics, or heat transfer [1], [2]. The 

general idea consists in transforming a potential problem 

bounded by a complicated geometry to a simpler one, 

for which the solution can be computed more easily. 

The transformation is performed by a conformal mapping 

function. Subsequently, this function also transforms the 

solution of the simpler problem to the complicated one 

which provides the solution of the original problem. 

Some recent examples for the application of conformal 

mapping methods are [3], [4], and [5]. 

For the purpose of conformal mapping the coordinates 

of two dimensional problems are interpreted as the real 

and imaginary parts of complex numbers. Thus, the 

mapping functions represent complex valued functions 

in complex variables [6]. Different methods are available 

for the construction of suitable mapping functions [2]. 

One often utilized method, the Schwarz-Christoffel transform 

(SCT), constructs mapping functions for polygonal 

geometries. Many relevant technical problems involve 

polygon-shaped boundaries; therefore, the SCT plays an 

important role in many applications. 

In order to employ the SCT, its problem-dependent 

parameters need to be computed. It is not possible to 

compute the parameters for polygons with more than 

three corners analytically, although a unique solution for 

the SCT parameters exists. This constitutes the so-called 

SCT parameter problem [7]. Nowadays numerical methods 

are routinely employed to solve the SCT parameter 

problem [8]. However, numerical methods yield solutions 

which are disconnected from the original problem geometry. 

This prevents further analysis with respect to the 

geometric parameters of the original problem. 

In this paper we propose a solution method for the SCT 

parameter problem based on a series expansion of the 

SCT base function (1). The method yields an approximate 

analytic solution for the SCT parameters containing the 

geometry parameters. Due to the approximation error 

the resulting mapping function produces mapping errors. 

We show that the mapping errors can be eliminated by 

prewarping the geometric parameters appropriately. This 

is achieved through an optimization method which minimizes 

the mapping error. The advantage of the proposed 

method over the standard numerical solution consists in 

the presence of the geometry parameters in the mapping 

function which permits further analysis of the problem, 

e.g. sensitivity analyses or solving inverse problems. 

This paper consists of three main parts. The first part 

gives a short introduction to the SCT together with 

the corresponding parameter problem. The second part 

describes the approximation method for the solution of 

the SCT parameter problem. The third part presents the 

minimization method which eliminates the mapping error. 

Finally, the conclusion sums up the properties of the 

proposed method and highlights its potential advantages 

and applications. 

II. THE SCHWARZ-CHRISTOFFEL TRANSFORM 

The SCT represents a widely utilized method for 

constructing conformal mapping functions. The SCT base 

equation, 

 

$$z = f(w) = A \int \prod_{i=1}^{n} (w - w_i)^{\frac{\alpha_i}{\pi} - 1} \, dw + B, \tag{1}$$

maps the upper half of the image (w-) plane to the 

inside of a polygon in the object (z-) plane [9] which 

is illustrated in Fig. 1. It contains several unknown 

parameters. Parameter A represents a scaling and rotation 

factor, parameter B represents a translation, and the 

parameters wi represent the corner coordinates in the 

w-plane with the known parameters αi representing the 

corresponding interior angles. 

The unknown parameters A, B and wi are computed 

by comparing the polygon corner coordinates in both 

planes via the relation z = f(w). The resulting number of 

equations equals the number of unknowns, which means 

that a unique solution exists. However, for polygons with 

more than three corners the integral in (1) yields special 

functions for which no inverse functions exist. This 

prohibits the analytic computation of the SCT parameters 

even though it is known that a unique solution exists. This 

constitutes the SC parameter problem [7]. 

Fig. 1. Example setup for the Schwarz-Christoffel transform showing the geometric parameters of a rectangle in the z-plane and its image in the w-plane. 

Nowadays, the parameter problem is usually solved 

numerically. A thorough discussion of numerical solution 

methods for the SC parameter problem is presented in 

[8]. The authors of [8] also provide a Matlab toolbox 

[10] which permits an easy application of the SCT. 

For quadrilaterals such as rectangles the SCT produces 

mapping functions which consist of elliptic functions. 

In these cases the elliptic modulus can be utilized to 

efficiently compute the parameters numerically [7]. An 

application example for this approach is presented in 

[11]. 

III. ANALYTIC APPROXIMATION OF THE PARAMETER 

PROBLEM 

The disadvantage of purely numerical solutions of the 

SC parameter problem consists in the missing relation to 

the original geometry. This prevents subsequent analyses 

of problems with respect to their geometry. Preserving 

this relation requires an analytic approach which is 

developed below. 

The proposed method consists of several steps. The 

first step consists in constructing the SCT for the problem 

at hand. The evaluation of the integral in (1) then yields 

a mapping function for which the SC parameters need to 

be computed. In order to compute them analytically, the 

mapping function is linearized. Then the SC parameter 

problem is solved for the linearized mapping function. 

The results are inserted back into the original nonlinear 

mapping function which then contains geometry parameters 

of the original problem. 

Unfortunately, the procedure is not quite that straightforward. 

Several problems which may occur during linearization 

need to be addressed, before the method can 

be applied successfully. 

A. Method Development 

An analysis of (1) shows that the SCT lends itself to 

Taylor series expansion. Equation (1) can be rewritten as 

 

$$z = A \int g(w) \, dw + B \tag{2}$$

where 

$$g(w) = \prod_{i=1}^{n} (w - w_i)^{\frac{\alpha_i}{\pi} - 1}$$

which represents the transformation core of the SCT 

defining the general polygon shape. Equation (2) indicates 

that, beginning with the first-order term, the Taylor 

series consists only of the transformation core g(w) and 

its derivatives. Because g(w) represents a product, this 

means that the series expansion consists mainly of simple 

functions. 
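
As a brief check of this statement, differentiating (2) shows how every coefficient beyond the zeroth-order term follows from g(w) alone. The logarithmic-derivative form of g'(w) below is a standard identity for products of powers; it is written out here only as an illustration and is not taken from the original derivation.

```latex
\begin{align*}
\frac{dz}{dw} &= A\,g(w), \\
\frac{d^2 z}{dw^2} &= A\,g'(w)
  = A\,g(w) \sum_{i=1}^{n} \frac{\alpha_i/\pi - 1}{w - w_i}.
\end{align*}
```

Each further derivative again yields g(w) multiplied by sums of rational terms in (w − wi), so no integral has to be evaluated for these coefficients.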

Expanding (2) as a Taylor series yields 

(3) 

$$z = f(w_0) + A g(w_0)(w - w_0) + O(w^2) \tag{4}$$

where $O(w^2)$ represents the higher order terms of the 

series. In order to be able to solve the parameter problem, 

the SC parameters must not appear within non-invertible 

functions. The non-invertible special functions contained 

in f(w) disappear in the first and higher order terms 

of the series. However, there remains a special function 

within the zeroth order term 

 

 

$$f(w_0) = A \left[ \int g(w) \, dw \right]_{w = w_0} + B. \tag{5}$$

The special function vanishes if the result of the integral 

at w0 equals zero. Parameter B represents a translation 

of the mapped polygon in the z-plane, thus it can be set 

to B = 0 without loss of generality. This corresponds 

to mapping the origins of the w- and z-plane onto each 

other, so the Taylor series expansion is centered at w0 = 

0 and f(w0) = 0. If a translation of the mapping result is 

truly required, this can be achieved by a simple additional 

mapping function. 

Truncating (4) after the first order term and incorporating 

the above considerations yields 

z = Ag(w0)(w − w0) (6) 

which represents a mapping function linearized with 

respect to the image coordinates w. Note that (6) usually 

will not be linear with respect to the geometric parameters. 
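
The linearized map (6) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the prevertex positions ±1 and ±2, and the choice A = 1 are hypothetical, and Python's principal branch is used for the complex powers.

```python
import math


def transformation_core(w, prevertices, angles):
    """Evaluate g(w) = prod_i (w - w_i)**(alpha_i/pi - 1) on the principal branch."""
    g = complex(1.0, 0.0)
    for w_i, alpha_i in zip(prevertices, angles):
        g *= (complex(w) - w_i) ** (alpha_i / math.pi - 1.0)
    return g


def linearized_map(w, A, prevertices, angles, w0=0.0):
    """First-order map z = A * g(w0) * (w - w0), i.e. equation (6)."""
    return A * transformation_core(w0, prevertices, angles) * (complex(w) - w0)


# Hypothetical symmetric rectangle: prevertices at +-1 and +-2 (assumed values),
# all interior angles equal to pi/2. The expansion point w0 = 0 is not a corner.
prevertices = [-2.0, -1.0, 1.0, 2.0]
angles = [math.pi / 2] * 4

g0 = transformation_core(0.0, prevertices, angles)
z = linearized_map(0.1, A=1.0, prevertices=prevertices, angles=angles)
print(abs(g0), abs(z))
```

Because g(w0) is a constant once the prevertices are fixed, the map is linear in w, while the prevertices themselves still enter through the product.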

The condition developed in the previous step requires 

that the origins of the planes are mapped onto each 

other. This can only be guaranteed if the coordinates of 

a point are known in both planes. This is the case for the 

polygon corners which are present in the transformation 

core g(w).


Fig. 2. Example for a symmetric polygon which features a point besides the corners for which its coordinates are known in both planes. 

The transformation core consists of a product of terms 

which contain the coordinates of the polygon corners 

in the w-plane. However, for a corner at w0 = 0 the 

transformation core evaluates either to zero or infinity, 

depending on the angle αi in the exponent corresponding 

to the corner. 

$$g(0) = 0 \quad \text{if } \alpha > \pi \tag{7}$$

$$g(0) = \infty \quad \text{if } \alpha < \pi$$
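
The two limiting cases can be illustrated numerically by looking at the magnitude of the single factor that belongs to the corner at the origin. The sketch below is only an illustration; the function name and the sample angles 3π/2 and π/2 are assumptions chosen for the example.

```python
import math


def corner_factor_magnitude(distance, alpha):
    """|w - w_i|**(alpha/pi - 1) at the given distance from the corner prevertex."""
    return distance ** (alpha / math.pi - 1.0)


distances = [1e-2, 1e-4, 1e-6]
# Interior angle larger than pi: positive exponent, the factor tends to zero.
reflex = [corner_factor_magnitude(d, 1.5 * math.pi) for d in distances]
# Interior angle smaller than pi: negative exponent, the factor grows without bound.
convex = [corner_factor_magnitude(d, 0.5 * math.pi) for d in distances]
print(reflex)
print(convex)
```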

Fig. 3. Illustration of the mapping error introduced into the mapping 

function by the linearization. 

In this case the mapping error can be defined as the 

distances between the ideal and the actual mapped corner 

positions in the z-plane 

$$\Delta\mathbf{z} = f(\mathbf{w}, \mathbf{d}) - \mathbf{z}, \tag{13}$$

where the components of w represent the coordinates 

wi of the polygon corners in the w-plane, and z and 

Δz consist of the corresponding ideal polygon corner 

positions and mapping errors in the z-plane. 

Equation (13) is not suited as an objective function 

for minimizing the mapping error because the sign of 

the mapping error may change. In order to avoid this, 

the sum of the squares of the real and imaginary parts of 

the mapping errors is utilized as the objective function 

$$\min \sum_{i=1}^{n} \left[ \operatorname{Re}(\Delta z_i)^2 + \operatorname{Im}(\Delta z_i)^2 \right], \tag{14}$$

where n represents the number of polygon corners. 

Equation (14) can be reformulated as 

$$\min \Delta\mathbf{z}^T \Delta\mathbf{z}^*, \tag{15}$$

where Δz∗ represents the vector containing the complex 

conjugate mapping error. The objective function in (15) 

forms a convex function with a global minimum at 0 

which is known to exist. In addition, (15) consists of a 

sum of squares. Specialized optimization algorithms exist 

for this type of function [12], [13], which ensures 

that a solution can be computed easily. 
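
The equivalence of the two formulations (14) and (15) can be checked with a short Python sketch. The function names and the sample error values are hypothetical and serve only as an illustration.

```python
def objective_sum_of_squares(dz):
    """Equation (14): sum of Re^2 + Im^2 over all corner errors."""
    return sum(e.real ** 2 + e.imag ** 2 for e in dz)


def objective_conjugate_form(dz):
    """Equation (15): dz^T dz*, which is real-valued up to rounding."""
    return sum(e * e.conjugate() for e in dz).real


# Hypothetical mapping errors at four polygon corners (illustrative values only).
dz = [0.03 - 0.01j, -0.02 + 0.04j, 0.0 + 0.0j, 0.01 + 0.02j]
j14 = objective_sum_of_squares(dz)
j15 = objective_conjugate_form(dz)
print(j14, j15)
```

Both forms are non-negative and vanish only when every corner error is zero, which is why the sign changes of the individual errors no longer matter.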

V. CONCLUSION 

In this paper we have presented a method for computing 

an analytic approximation solution of the SC 

parameter problem. The proposed method takes advantage 

of the structure of the SCT base equation and 

shows that a linearization of the problem is possible 

under certain conditions. These conditions are defined 

and their consequences regarding the modeling process 

are discussed. 

The resulting conformal mapping function contains 

approximation errors. We presented a formulation of the 

corresponding mapping error which permits its minimization 

to arbitrarily small dimensions by varying the 

geometric parameters of the original polygon. Thus, the 

proposed method yields a conformal mapping function 

which produces the desired map. 


The advantage of the proposed analytic over the conventional 

numeric approximation consists in the presence 

of the geometry parameters in the mapping function. 

Together with the minimization procedure this permits 

• the solution of inverse problems e.g. in capacitive 

sensing applications, 

• the sensitivity analysis of sensor setups with respect 

to their geometry [14]. 


The authors gratefully acknowledge the partial financial 

support for the work presented in this paper by the 

Austrian Research Promotion Agency and the Austrian 

COMET program supporting the Austrian Center of 

Competence in Mechatronics (ACCM). 

REFERENCES 

[1] P. M. Morse and H. Feshbach, Methods of theoretical physics. 

McGraw-Hill, 1953, vol. 1. 

[2] R. Schinzinger and A. Laura, Conformal Mapping: Methods and 

Applications. Elsevier, 1991. 

[3] A. Verhoff, “Generalized Poisson integral formula applied to 

potential flow solutions for free and confined jets with secondary 

flow,” Computers & Fluids, vol. 54, pp. 18–38, 2012. 

[4] A. J. Davidson and N. J. Mottram, “Conformal mapping techniques 

for the modelling of liquid crystal devices,” European 

Journal of Applied Mathematics, vol. 23, no. 01, pp. 99–119, 

2012. 

[5] M. Schwarz, T. Holtij, A. Kloes, and B. Iñíguez, “Analytical 

compact modeling framework for the 2D electrostatics in lightly 

doped double-gate MOSFETs,” Solid-State Electronics, vol. 69, pp. 

72–84, 2012. 

[6] P. Henrici, Applied and Computational Complex Analysis. John 

Wiley & Sons, 1974, vol. 1. 

[7] ——, Applied and Computational Complex Analysis. John Wiley 

& Sons, 1986, vol. 3. 

[8] T. A. Driscoll and L. N. Trefethen, Schwarz–Christoffel Mapping, 

P. Ciarlet, A. Iserles, R. Kohn, and M. Wright, Eds. Cambridge 

University Press, 2002. 

[9] V. I. Smirnov, Lehrbuch der höheren Mathematik, 14th ed. Harri 

Deutsch, 1995. 

[10] T. Driscoll. (2012, Sep.) Schwarz-Christoffel toolbox 

for MATLAB. University of Delaware, Department 

of Mathematical Sciences. [Online]. Available: 

http://www.math.udel.edu/ driscoll/software/SC/ 

[11] R. Igreja and C. Dias, “Extension to the analytical model of the 

interdigital electrodes capacitance for a multi-layered structure,” 

Sensors and Actuators A: Physical, vol. 172, no. 2, pp. 392–399, 

2011. 

[12] R. Fletcher, Practical methods of optimization, 2nd ed. Wiley, 

2000. 

[13] P. E. Gill, W. Murray, and M. H. Wright, Practical optimization. 

Academic Press, 1981. 

[14] N. Eidenberger and B. G. Zagar, “Sensitivity of capacitance 

sensors for quality control in blade production,” in ICST 2011, 

Dec. 2011.


Additional Eddy Current Losses in Induction 

Machines Due to Interlaminar Short Circuits 

P. Handgruber∗, A. Stermecki∗, O. Bíró∗, and G. Ofner† 

∗Institute for Fundamentals and Theory in Electrical Engineering (IGTE), 

Christian Doppler Laboratory for Multiphysical Simulation, Analysis and Design of Electrical Machines, 

Inffeldgasse 18/I, A-8010 Graz, Austria 

† ELIN Motoren GmbH, Elin-Motoren-Straße 1, A-8160 Preding/Weiz, Austria 

E-mail: paul.handgruber@tugraz.at 

Abstract—A novel three-dimensional eddy current model to account for the additional losses caused by interlaminar short 

circuits is presented and applied to the loss estimation of an induction machine. The method is based on a single sheet model 

with appropriate boundary conditions on the interlaminar contact surface avoiding cumbersome full three-dimensional 

models with multiple short circuited laminations. The results of the single sheet model without short circuits are compared 

to measured no-load iron loss data. The interlaminar model is validated by means of full models comprising several 

interconnected sheets and used for the quantification of extra losses caused by conductive joints and shearing burrs. It has 

been found that particularly burrs occurring on the tooth edges lead to a significant loss increase. 

Index Terms—AC-machines, eddy currents, electric machines, electromagnetic modeling, finite element methods, magnetic 

losses 

I. INTRODUCTION


Active iron parts in electrical machines are commonly 

built of thin steel laminations. In an ideally assembled 

core, the individual laminates are isolated from each 

other in order to minimize the effects of eddy currents. 

During the cutting process, mechanical deformations lead 

to microscopic shearing burrs on the cutting edges. These 

edge burrs can break down the insulation resulting in 

conductive connections between the stacked sheets. If 

the burr-induced short circuits cover several laminations, 

high currents begin to circulate leading to a significant 

loss increase and hence to local overheating [1]. 

The core stacks are frequently held together by conductive 

joints, such as bolts, welds or clamping bars. 

In most cases, these fixations as well as the shaft and 

parts of the frame are mounted uninsulated on the core, 

short-circuiting a large number of laminations. Further 

interlaminar contacts are induced by small insulation 

faults on the lamination surfaces inside the core middle. 

However, such inner short circuits occur stochastically and 

with very low probability [2]. Therefore, their effects are 

not considered in this work. 

Up to now, the complex problem of interlaminar short 

circuits has been chiefly treated by statistical and/or empirical 

methods accompanied by vast measurement series 

[3], [4]. Attempts to quantify the resulting additional 

losses are mostly based on analytical approaches. For 

instance, in [5], [6] and [7] artificial burrs have been applied 

in a controlled manner to a distribution transformer. 

Comparisons of the performed measurements to an analytical 

eddy current model showed only poor correlation 

due to the simplifications made. In [8], [9] and [10] the 

losses have been evaluated using a resistance network 

analogy. The circuit parameters have been derived from 

small-scale models based on simple geometries. 

In order to enable studies on more complicated structures 

like an electrical machine, a novel method based 

on a three-dimensional (3-D) finite element analysis 

is proposed in this work. The method is capable of 

computing the true paths of the eddy currents and the values 

of the ensuing losses. A full 3-D model with multiple 

joined laminations is avoided by introducing appropriate 

boundary conditions on the interlaminar contact surface 

of a single sheet model. 

This paper is organized as follows: In section II a 

novel 3-D eddy current model considering the effects of 

interlaminar short circuits is introduced. In section III 

the method presented is applied to the loss estimation 

of a megawatt rated slip-ring induction machine. First, 

the single sheet model without short circuits is compared 

to no-load iron loss measurements. The validation of 

the interlaminar contact model is performed using full 

models with several short circuited laminations. Finally, 

the effects of conductive joints and shearing burrs are 

subjected to an in-depth analysis. 

II. 3-D EDDY CURRENT MODEL 

The proposed method can be subdivided into two steps: 

first, a transient two-dimensional (2-D) field analysis is 

carried out for the whole machine. In a second step, a 

transient nonlinear 3-D eddy current problem is solved 

separately for an individual stator and rotor sheet. 

The 3-D model is excited by time-dependent boundary 

conditions obtained from the 2-D field analysis. This 

allows separate treatment of the stator and rotor sheets 

and avoids cumbersome and computationally expensive

transient 3-D finite element simulations including rotor 

motion [11]. Nevertheless, all loss relevant effects, like 

high-order field harmonics and the rotor movement are 

included in the analysis. 

Only a single sheet is considered in the 3-D model; 

the effects of interlaminar short circuits are taken into 

account by boundary conditions on the contact surface. 

This approach is valid insofar as, for multiple interconnected 

laminates, the electromagnetic quantities in the 

sheets within the core become periodic. 

A nodal based A,V -A formulation is employed for the 

analysis of the 2-D problem. The 2-D iron regions are assumed 

to be non-conductive. The massive rotor windings 

are modeled as eddy current domains but not the stator 

conductors. The starting transients of the machine are 

bypassed by using the initial steady state solution from 

a time harmonic analysis. The 3-D eddy current problem 

is solved solely in the conductive laminations using the 

A,V formulation based on isoparametric hexahedral edge 

elements with quadratic shape functions [12]. Introducing 

the magnetic vector potential A in the whole problem 

domain Ω and the electric scalar potential V in the eddy 

current region Ωc as 

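$$\left. \begin{aligned} \mathbf{B} &= \nabla \times \mathbf{A}, \\ \mathbf{E} &= -\frac{\partial \mathbf{A}}{\partial t} - \frac{\partial}{\partial t} \nabla V \end{aligned} \right\} \quad \text{in } \Omega_c, \tag{1}$$

$$\mathbf{B} = \nabla \times \mathbf{A} \quad \text{in } \Omega - \Omega_c \tag{2}$$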

where B is the magnetic flux density, E is the electric 

field intensity and t is time, the Maxwell equations under 

quasistatic approximation can be written as 

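$$\left. \begin{aligned} \nabla \times \nu \nabla \times \mathbf{A} + \sigma \frac{\partial \mathbf{A}}{\partial t} + \sigma \frac{\partial}{\partial t} \nabla V &= \mathbf{0}, \\ \nabla \cdot \left( \sigma \frac{\partial \mathbf{A}}{\partial t} + \sigma \frac{\partial}{\partial t} \nabla V \right) &= 0 \end{aligned} \right\} \quad \text{in } \Omega_c, \tag{3}$$

$$\nabla \times \nu \nabla \times \mathbf{A} = \mathbf{J}_0 \quad \text{in } \Omega - \Omega_c. \tag{4}$$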

Herein ν denotes the reluctivity and σ the conductivity of 

the sheet material, J0 represents the given current density 

of the stranded coils in the 2-D model. 

Fig. 1 shows the specifications of the boundary conditions 

for the 3-D model of a stator sheet. The boundary 

conditions are prescribed anew for every time step. On 

the boundaries along the lamination thickness, the model 

is excited by the normal component of B derived from 

the 2-D field analysis. The prescription of the normal 

component of B is equivalent to specifying the tangential 

component of A. This component is obtained from the 

2-D field vector potential A2-D and is assumed to be 

constant along the thickness. The boundaries along the 

sheet thickness are hereinafter called axial boundaries. 

Since flux in the axial direction is neglected, the tangential 

component of A is set to zero on the boundaries 

parallel to the laminations. 

Fig. 1. Specification of the boundary conditions for a 3-D stator sheet 

model. The sheet thickness is increased for better visibility. Symmetry 

boundary conditions on the bottom surface allow only half 

of the thickness to be considered. 

Setting the electric scalar potential free on the boundaries 

of Ωc denoted by Γi ensures the satisfaction of the 

natural boundary condition of vanishing normal component 

of the current density J here. At the interlaminar 

contact surface Γc, the current flow is assumed to be 

normal to the face, resulting in a constant and unknown 

electric scalar potential. Furthermore, it has to be guaranteed 

that the exchanged current through the contact spots 

between the sheets is zero by introducing the surface 

integral relation 

 

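$$\int_{\Gamma_c} \sigma \left( \frac{\partial \mathbf{A}}{\partial t} + \frac{\partial}{\partial t} \nabla V \right) \cdot \mathbf{n} \, d\Gamma = 0 \tag{5}$$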

as an additional equation into the finite element equation 

system [13]. The symbol n stands for the outer normal 

vector. 

For interlaminar contacts occurring on the sheet boundaries 

like shearing burrs, the effects of the high circulating 

eddy currents on the given magnetic boundary field 

cannot be neglected. In such cases, the magnetic field 

density is not prescribed directly on the axial sheet 

boundaries, but on an additional outer finite element layer 

surrounding the lamination. This layer is modeled as non-conductive, 

extending the 3-D problem to an A,V -A one. 

Thus, the required independence of the prescribed field 

from the eddy currents is ensured again. 

III. APPLICATION AND RESULTS 

The method presented has been applied to the loss 

estimation of a megawatt rated, 50 Hz, 690 V, delta-connected 

three-phase, four pole slip-ring induction machine 

fed by sinusoidal voltage. In the case of a healthy 

machine without short circuits present in the core, the 

computed total iron losses are compared to no-load 

measurements. After validating the proposed interlaminar 

contact model against full models with many short 

circuited sheets, the additional losses due to conductive 

joints and shearing burrs will be quantified.



Fig. 2. Eddy current loss density distribution in a stator (a) and rotor (b) sheet at rated no-load current and a specific time instant. 

A. No-load iron losses 

According to the statistical loss theory, the total iron 

losses are composed of eddy current, hysteresis and 

excess losses [14]. The computed 3-D eddy current 

loss distribution for a healthy stator and rotor sheet 

is presented in Fig. 2. The losses are quite uniformly 

distributed in the stator sheet and mainly attributable to 

the 50 Hz fundamental frequency component. The rotor 

losses are concentrated in the vicinity of the air-gap 

and primarily evoked by the first stator slot harmonic at 

1800 Hz. In order to cover all relevant harmonic effects, 

the time step size Δt was fixed to Δt = T/500 for the stator 

and Δt = T/1000 for the rotor sheet simulation; T is the 

time period of the fundamental frequency. The total eddy 

current losses are obtained by integrating the product of 

E and J over the sheet volume, averaged over time and 

multiplied by the number of laminations. The proportion 

of the losses in the rotor core is remarkable and nearly 

as high as in the stator (see Fig. 3(a)). 
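
The loss evaluation described above can be sketched as follows. This is a minimal illustration, assuming element-wise constant fields and equidistant time steps over one period; the function name, the data layout, and all numerical values (including the conductivity) are hypothetical and not taken from the machine model.

```python
def total_eddy_current_loss(e_fields, j_fields, volumes, n_laminations):
    """Time-averaged sum of (E . J) * element volume, scaled by the sheet count.

    e_fields[t][k] and j_fields[t][k] are 3-component tuples for element k at
    time step t (equidistant steps over one period); volumes[k] is in m^3.
    """
    n_steps = len(e_fields)
    power_sum = 0.0
    for e_step, j_step in zip(e_fields, j_fields):
        for e_vec, j_vec, vol in zip(e_step, j_step, volumes):
            dot = sum(e * j for e, j in zip(e_vec, j_vec))
            power_sum += dot * vol
    return n_laminations * power_sum / n_steps


sigma = 2.0e6  # hypothetical conductivity in S/m
e_fields = [
    [(0.01, 0.0, 0.0), (0.0, 0.02, 0.0)],  # time step 0, two elements
    [(0.0, 0.0, 0.0), (0.0, 0.01, 0.0)],   # time step 1, two elements
]
# J = sigma * E in the conductive sheet (linear material assumed for the example).
j_fields = [[tuple(sigma * c for c in e) for e in step] for step in e_fields]
volumes = [1.0e-9, 2.0e-9]

loss = total_eddy_current_loss(e_fields, j_fields, volumes, n_laminations=1000)
print(loss)
```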

Fig. 3. Simulated no-load eddy current losses (a) and separated total 

iron losses (b) as a function of the supply current as well as measured 

losses. 

The measured and computed no-load iron losses as 

a function of the supply current are compared in Fig. 

3(b). During the measurements, the rotor windings were 

kept open in order to avoid additional rotor currents 

and hence further losses. The test machine was driven 

by a second machine at synchronous speed. Thereby, 

the friction losses are covered by the driving machine. 

The measured losses on the stator terminals of the test 

machine correspond to the iron losses and stator winding 

losses. For the computation, the hysteresis and excess 

losses are obtained by a method developed in [15] based 

on the evaluation of the shape of dynamic magnetization 

curves. The hysteresis losses are calculated by a static 

vector Preisach model [16], [17], the excess losses using 

the statistical loss theory [14]. The good agreement 

between the measurement and simulation results confirms 

that the methods used are able to cope with the complex 

electromagnetic phenomena arising in an induction machine, 

i.e. rotating flux and high-order field harmonics. 

The stator eddy current losses for no-load can even be 

evaluated correctly using time harmonic analyses, since 

they are almost exclusively caused by the fundamental 

field component. The methodology of the proposed two-step 

approach can be adopted in a straightforward way 

to time harmonic problems. For rated no-load current 

(I0=288 A), the losses obtained from a transient analysis 

are 901.1 W, the time harmonic analysis yields 903.7 W. 

When load is applied, the field harmonics increase 

strongly and transient analyses are required [18]. In 

order to shorten the computation time, the following 

investigations on the effects of interlaminar short circuits 

have been performed for the stator sheet at rated no-load 

current using time harmonic calculations. However, 

all relevant factors influencing the behavior underlying 

an interlaminar short circuit are still incorporated in the 

computation. The use of time harmonic analyses constitutes 

no restriction of the introduced approach which is 

easily expandable for transient simulations. 

B. Validation of the interlaminar model 

The proposed single sheet model combined with 

boundary conditions considering the interlaminar interaction 

has been validated using two different examples: 

a simple conductive ring and a stator sheet sector of the 

induction machine investigated. 

1) Conductive ring: As shown in Fig. 4(a), the 2-D 

model of the ring is excited by a conductor arranged 

symmetrically in the bore separated from the core by 

an air-gap. The 3-D model (see Fig. 4(b)) consists of 

ten and a half stacked lamination quarters short circuited 

by two conductive joints through the entire core stack. 

The joint material is the same as that of the laminates. 

The isolation between the sheets is modeled as a non-conductive 

A-region with a relative permeability of one 

and a thickness of one hundredth of that of the laminations. 

On the curved boundaries, the 3-D problem is 

excited by boundary conditions derived from the 2-D 

field solution. Periodic boundary conditions have been 

applied on the left and right boundaries. On the top 

surface, the normal component of B and that of J are set 

to zero; on the bottom surface, the problem is mirrored 

using symmetric boundary conditions. The 2-D problem 

is treated with the time harmonic A formulation, the 

3-D one with the A,V -A formulation. Linear material 

properties are assumed for the sake of simplicity. 

Fig. 4. Validation geometry of the conductive ring: 2-D (a) and 3-D 

(b) model. 

Figs. 5(a) and 5(b) show the current and flux density 

distributions computed by the full model. The electromagnetic 

quantities in the center sheets repeat periodically 

suggesting the applicability of the reduced 

approach. Consequently, the currents circulating through 

the contact spots affect the original 2-D field distribution 

(see Figs. 5(c) and 5(d)). The undermost half lamination 

serves as a validation reference for the reduced model 


with a single sheet. Fig. 6 compares the obtained field 

solutions. The good agreement is also verified in terms 

of losses which are 10.93 mW for the full model and 

11.37 mW for the reduced one. 

2) Stator sheet sector: Since 3-D simulations with 

multiple short circuited stator laminations are hardly 

feasible, only a small sheet sector is considered in the 

full model. Fig. 7(a) shows the used 2-D model of 

the previously examined induction machine, Fig. 7(b) 

the 3-D model comprising one hundred and a half laminations 

interconnected by shearing burrs over two teeth 

throughout the stack. A continuous burr width of 100 μm 

has been selected, requiring a rather fine mesh near the 

burred region. Results for different burr widths will be 

discussed in section III-C2. The burr material properties 

are the same as those of the sheets. The isolation thickness 

is specified as one fiftieth of the lamination thickness 

(d = 0.5 mm). The 2-D field solution is prescribed on an 

outer non-conductive layer enclosing the teeth and on the 

stator back. The normal component of both B and J is 

set to zero on the top surface and on the boundaries intersecting 

the yoke. Again, the problem region is mirrored 

at the bottom surface and solved in the frequency domain 

using the A,V -A formulation. Linear media are used for 

validation purposes; nonlinearity is considered in the next 

sections by means of well established methods. 

Fig. 5. Current density (a) and flux density (b) at a specific time instant for the full model as well as the corresponding vector plots (c,d) for the 

framed sectors. 

Fig. 6. Current density and flux density distribution computed for the bottom lamination of the full model (a,b) and the reduced one (c,d). 

Fig. 7. Validation geometry of the stator sheet sector: 2-D (a) and 

3-D (b) model. 

Fig. 8. Current density distribution at rated no-load current and a 

specific time instant for the full model. 

The eddy current distribution for rated no-load current 

calculated by the full model is shown in Fig. 8. In 

the burr region, high currents are closing over the teeth 

(see also Fig. 12). The induced currents increase with 

the number of short circuited laminations and approach 

asymptotically a final value. Accordingly, the reduced 

model presents a worst case scenario for a suitably 

large number of interconnected sheets. The required sheet 

number depends on various factors, such as size and 

position of the short circuits, sheet dimensions or material 

properties. The undermost lamination of the full model 

and the reduced method again give similar current density 

distributions as shown in Fig. 9. The resulting losses 

for the full model are 392.0 mW, while the reduced approach predicts 363.9 mW.
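The agreement between the two models can be expressed as a relative deviation. The following minimal sketch uses only the two loss values quoted above; the variable names are illustrative and not taken from the paper.

```python
# Loss values quoted in the text (mW).
p_full_mw = 392.0      # full model with several short-circuited laminations
p_reduced_mw = 363.9   # reduced single-sheet model

# Relative deviation of the reduced model with respect to the full model.
rel_dev_percent = 100.0 * (p_reduced_mw - p_full_mw) / p_full_mw
print(f"relative deviation: {rel_dev_percent:.2f} %")
```

For this validation example the reduced model lies roughly 7 % below the full model.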

C. Effects of interlaminar short circuits 

Two examples of interlaminar short circuits will be 

addressed in more detail. The first one involves conductive 

joints represented by clamping bars. Second, the 

impact of shearing burrs will be discussed by means of 

parametric studies. All of the following simulations have 

been carried out for rated no-load current in the frequency 

domain using the proposed interlaminar model. Nonlinearity 

is taken into account by an effective reluctivity 

approach [19]. 

1) Conductive joints: The iron stacks of mid-power 

range machines are often axially fixed by clamping bars 

installed after pressing the core. These clamping plates 

are attached uninsulated on the core back leading to 

a conductive connection among the sheets. Different 

numbers of bars installed along the stator back circumference 

have been investigated using the method presented. 

Fig. 10 demonstrates the computed current and flux 

density distribution when four clamping bars per pole 

pitch are considered in the simulation. On the one hand, 

the induced currents in axial direction cause additional 

losses. On the other hand, they oppose the original field 

distribution increasing the magnetic flux density and 

hence the losses near the clamping areas. A detailed view 

of the electromagnetic quantities near a contact area is 

shown in Fig. 11. The eddy current losses in the stator 

core increase by 3.93 % for one installed connection bar 

per pole. For four bars they rise by 13.65 %.


Fig. 9. Current density distribution for the bottom lamination of the full (a) and reduced (b) model.

Fig. 10. Maximal flux density (a) and loss density (b) distribution for four clamping bars per pole installed.

Fig. 11. Current density (a) and flux density (b) in a contact area at a specific time instant as well as the corresponding vector plots in the bar section (c,d).

2) Shearing burrs: The quantification of the extra 

losses caused by edge burrs is carried out performing 

simulations for different numbers of teeth afflicted by 

burrs and various burr widths. Contrary to the former 

validation example, a complete stator sheet quarter has 

been considered in the computations. Fig. 12 shows the 

eddy current paths in the burr layer for different numbers 

of teeth burred. The high magnetizing flux in the tooth 

area induces interlaminar currents enclosing the teeth and 

resulting in a steep rise of the losses. The trends in Fig. 

13 indicate a cubic dependence of the additional losses on 

the number of burred teeth. The application of burrs on 

the stator back led to no significant loss increase owing 

to the small flux density near the outermost rim.


Fig. 12. Current density vector plots for one (a), two (b), three (c) and five (d) burred teeth at a specific time instant and a burr width of 100 μm. Only the burr region is shown in the plot.

Fig. 13. Losses as a function of the number of burred teeth for different burr widths. [Plot: eddy current losses per sheet P_cl in W (0 to 10) versus number of teeth burred (0 to 5); curves for b_burr = 10, 20, 40, 60, 80 and 100 μm.]

The width of the burr layer has been varied in a 

practically relevant range from 10 to 100 μm. Even 

broader interlaminar contacts can be present in laser-cut

sheets due to the heat-induced insulation burn-off. As

shown in Fig. 14, the losses rise linearly with the burr 

width. The loss behavior for different no-load supply 

currents can be seen in Fig. 15. The losses increase 

quadratically for lower values, whereas at higher currents, 

saturation effects occur limiting the maximal attainable 

flux densities and thus losses. 

Fig. 14. Losses as a function of the burr width for different numbers of burred teeth. [Plot: losses (0 to 10) versus burr width b_burr in μm (10 to 100); legible legend entries: 1 tooth burred, 2 teeth burred.]

IV. DISCUSSION AND CONCLUSIONS

The method presented makes it possible to compute the true 3-D eddy current distribution underlying an interlaminar short circuit, allowing a quantitative assessment of the arising additional losses. In order to avoid full models

comprising multiple short circuited laminations, only a 

single sheet is considered. The interlaminar interaction is 

taken into account by boundary conditions on the contact 

surface using a generalized A,V-A formulation.

First, the method has been applied to the no-load 

iron loss estimation of a slip-ring induction machine. 

To this end, simulations for a healthy machine with no

short circuits present in the core have been carried out. 

The transient 3-D eddy current problem has been excited 

by boundary conditions derived from a classical 2-D field 

analysis and solved separately for a stator and rotor sheet. 

The hysteresis and excess losses required for comparisons to no-load iron loss measurements have been evaluated by a static Preisach vector model and the statistical loss theory, respectively. Good agreement was obtained between measured and simulated results.

Fig. 15. Losses as a function of the supply current for two burred teeth and different burr widths. [Plot: losses (0 to 2) versus no-load current I0 in A (50 to 400); curves for b_burr = 10, 20, 40, 60, 80 and 100 μm.]

The effects of interlaminar short circuits have been 

studied for the stator sheet at rated no-load using time-harmonic

computations, since it was found that the occurring 

eddy current losses are almost completely caused 

by the fundamental field component. Comparisons of the 

interlaminar contact model against full models with many 

laminations joined confirmed the validity of the reduced 

method as long as the electromagnetic quantities in the 

short-circuited sheets stay periodic. For a low number of 

interconnected laminations, the full model yields lower 

losses than the reduced one. Consequently, the reduced 

model constitutes a worst-case approximation. 

Applications of the proposed approach revealed a 

significant loss increase for conductive paths introduced 

by shearing burrs on the tooth edges. It should be noted 

that the burrs studied will not be present to such an extent 

in a healthy machine, but the trends and findings will 

still apply, indicating the strong necessity to minimize 

burr-induced short circuits as far as practicable.

The magnetic and electric properties have been assumed 

to be unaffected by the manufacturing process. In 

[20], [21] and [22] it was shown that especially the magnetic 

permeability near the edges can vary considerably 

due to the mechanical stress applied during punching. 

The incorporation of these effects as well as combinations 

of the developed method with measurement-based 

statistics have to be carried out in future work. 

V. ACKNOWLEDGMENT 

This work has been supported by the Christian 

Doppler Research Association (CDG) and by the ELIN 

Motoren GmbH. 

REFERENCES 

[1] P. Beckley, Electrical Steels for Rotating Machines. The Institution 

of Engineering and Technology, 2002. 


[2] M. C. Marion-Pera, A. Kedous-Lebouc, T. Waeckerle, and B. Comut, 

“Characterization of SiFe Sheet Insulation,” IEEE Transactions 

on Magnetics, vol. 31, no. 4, pp. 2408–2415, 1995. 

[3] A. C. Beiler and P. L. Schmidt, “Interlaminar Eddy Current Losses 

in Laminated Cores,” Transactions of the American Institute of 

Electrical Engineers, vol. 66, pp. 872–78, 1947. 

[4] C. A. Schulz, S. Duchesne, D. Roger, and J.-N. Vincent, “Capacitive 

short circuit detection in transformer core laminations,” 

Journal of Magnetism and Magnetic Materials, vol. 320, pp. 

e911–e914, 2008. 

[5] A. J. Moses and M. Aimoniotis, “Effects of Artificial Edge Burrs 

on the Properties of a Model Transformer Core,” Physica Scripta, 

vol. 39, pp. 391–393, 1989. 

[6] R. Mazurek, P. Marketos, A. Moses, and J.-N. Vincent, “Effect 

of Artificial Burrs on the Total Power Loss of a Three-Phase 

Transformer Core,” IEEE Transactions on Magnetics, vol. 46, 

no. 2, pp. 638–641, 2010. 

[7] R. Mazurek, H. Hamzehbahmani, A. Moses, P. I. Anderson, F. J. 

Anayi, and B. Thierry, “Effect of Artificial Burrs on Local Power 

Loss in a Three-Phase Transformer Core,” IEEE Transactions on 

Magnetics, vol. 48, no. 4, pp. 1653–1656, 2012. 

[8] D. A. Jones and W. S. Leung, “A theoretical and analogue 

approach to stray eddy-current loss in laminated magnetic cores,” 

Proceedings of the IEE - Part C: Monographs, vol. 108, no. 14, 

pp. 509–519, 1961. 

[9] C. A. Schulz, D. Roger, S. Duchesne, and J.-N. Vincent, “Experimental 

Characterization of Interlamination Shorts in Transformer 

Cores,” IEEE Transactions on Magnetics, vol. 46, no. 2, pp. 614– 

617, 2010. 

[10] J.-P. Bielawski, S. Duchesne, D. Roger, C. Demian, and T. Belgrand, 

“Contribution to the Study of Losses Generated by Interlaminar 

Short-Circuits,” IEEE Transactions on Magnetics, vol. 48, 

no. 4, pp. 1397–1400, 2012. 

[11] K. Yamazaki and N. Fukushima, “Iron-Loss Modeling for Rotating 

Machines: Comparison Between Bertotti’s Three-Term Expression 

and 3-D Eddy-Current Analysis,” IEEE Transactions on 


[12] O. Bíró, “Edge element formulations of eddy current problems,” 

Computer Methods in Applied Mechanics and Engineering, vol. 

169, no. 3-4, pp. 391–405, 1999. 

[13] I. Bakhsh, O. Bíró, and K. Preis, “Skin effect problems with 

prescribed current condition,” in Proceedings of the 14th Int. IGTE

Symp. on Numerical Field Calculation in Electrical Engineering, 

2010. 

[14] G. Bertotti, Hysteresis in Magnetism. Academic Press, 1998. 

[15] E. Dlala, “A Simplified Iron Loss Model for Laminated Magnetic 

Cores,” IEEE Transactions on Magnetics, vol. 44, no. 11, pp. 

3169–3172, 2008. 

[16] E. Dlala, J. Saitz, and A. Arkkio, “Inverted and Forward Preisach 

Models for Numerical Analysis of Electromagnetic Field Problems,” 

IEEE Transactions on Magnetics, vol. 42, no. 8, pp. 1963– 

1973, 2006. 

[17] E. Dlala, “Efficient Algorithms for the Inclusion of the Preisach 

Hysteresis Model in Nonlinear Finite-Element Methods,” IEEE 

Transactions on Magnetics, vol. 47, no. 2, pp. 395–408, 2011. 

[18] E. Dlala, O. Bottauscio, M. Chiampi, M. Zucca, A. Belahcen, and 

A. Arkkio, “Numerical Investigation of the Effects of Loading and

Slot Harmonics on the Core Losses of Induction Machines,” IEEE 

Transactions on Magnetics, vol. 48, no. 2, pp. 1063–1066, 2012. 

[19] G. Paoli and O. Bíró, “Time harmonic eddy currents in nonlinear 

media,” COMPEL: The International Journal for Computation 

and Mathematics in Electrical and Electronic Engineering, 

vol. 17, no. 5/6, pp. 567–575, 1998. 

[20] F. Ossart, E. Hug, C. Hubert, Olivier Buvat, and R. Billardon, 

“Effect of punching on electrical steels: Experimental and numerical 

coupled analysis,” IEEE Transactions on Magnetics, vol. 36, 

no. 5, pp. 3137–3140, 2000. 

[21] A. Schoppa, J. Schneider, and J.-O. Roth, “Influence of the cutting 

process on the magnetic properties of non-oriented electrical 

steels,” Journal of Magnetism and Magnetic Materials, vol. 215- 

216, pp. 100–102, 2000. 

[22] K. Fujisaki, R. Hirayama, T. Kawachi, S. Satou, C. Kaidou, 

M. Yabumoto, and T. Kubota, “Motor Core Iron Loss Analysis 

Evaluating Shrink Fitting and Stamping by Finite-Element 

Method,” IEEE Transactions on Magnetics, vol. 43, no. 5, pp. 

1950–1954, 2007.


Evaluating the influence of manufacturing tolerances in permanent magnet synchronous machines

I. Coenen, T. Herold, C. Piantsop Mbo’o, and K. Hameyer 

Institute of Electrical Machines, RWTH Aachen University, Schinkelstrasse 4, D-52056 Aachen, Germany 

E-mail: isabel.coenen@iem.rwth-aachen.de 

Abstract—Manufacturing tolerances can result in unwanted behavior of electrical machines. Undesired parasitic effects such as torque ripple may be increased. A quality control of machines subsequent to manufacturing is therefore required in order to test whether the machines comply with their specifications. This is useful to ensure a high reliability of the manufactured machines. This paper describes the consideration of rotor tolerances due to non-ideal manufacturing processes. The idea is to estimate the influence of the manufacturing tolerances in order to realize a reliable quality control. To study various fault scenarios, numerical field simulations are employed which are parameterized by measurements.

Index Terms—electrical machines, Finite Element Analysis (FEA), manufacturing tolerances, quality control. 


I. INTRODUCTION

The reliability of electrical drives [1] is an important

aspect to ensure a high availability. In industrial applications, 

permanent magnet excited synchronous machines 

(PMSM) are widely employed as they offer advantages 

in efficiency and power density. However, especially the 

rotor of PMSMs is susceptible to tolerances caused during 

mass production. Variations from the ideal machine 

influence its operational behavior [2]. Therefore, it is 

important to verify the machine’s quality prior to its 

installation. 

Reliable and widely used diagnostic methods are vibration and current monitoring [3]. This study focuses on electrical quantities because this offers the advantage that no additional sensors need to be installed [4].

A. Proposed monitoring setup 

The most often proposed end-of-line test is back-EMF 

monitoring [5], [6]. Fig. 1 shows a possible setup for its 

realization. Here, the motor under test is driven under 

open-circuit conditions. For attenuation of the drive’s 

influence, a flywheel is employed between drive and 

motor under test.

Fig. 1. Back-EMF monitoring setup.

This approach is a non-invasive monitoring method, which is beneficial for diagnosis. However, such a setup is very expensive. It is costly because a drive is required and an additional device is needed to damp possible influences of the drive. Above all, it is time-consuming because the motor under test has to be mechanically coupled to the drive. This is not an efficient solution when a large number of machines needs to be tested during mass production.

In this study, an additional approach is investigated 

where the current is being monitored. The corresponding 

setup is shown in Fig. 2. Here, a start-up of the motor 

up to a certain speed is performed in such a way that the 

current is measured at various speeds. The benefit of this 

method is its time- and cost-saving setup. When compared 

to the back-EMF setup, less hardware is needed. 

No mechanical coupling to a drive is required; the motor is simply connected to the inverter.

However, for evaluating the current, it needs to be considered 

that the current is a controlled quantity. Impacts 

caused by the control system or the inverter supply might 

lead to misinterpretation of the results.

Fig. 2. Current monitoring setup.

In the following, the back-EMF and current characteristics of a PMSM are determined. In order to study various fault scenarios, numerical field calculation is employed considering tolerance-affected rotor components. The aim is to evaluate the influence of such tolerances in order to determine distinguishing characteristics. This information can be helpful to develop an appropriate end-of-line test and to reveal which of the proposed setups is most suitable.

II. MOTOR UNDER STUDY 

The machine studied within this work is a three-phase permanent magnet synchronous machine with a tooth-coil winding system. It has six stator slots and four pole pairs p. The eight magnets of the rotor are arranged in a spoke configuration.

III. INFLUENCE OF ROTOR TOLERANCES 

During the manufacturing process, material-dependent faults as well as geometrical or shape deviations may occur. Such tolerances influence the machine’s behavior. For instance, they can increase torque ripple [7].

The tolerances considered within this paper concern the magnet’s material and its dimensions. The magnetization faults are illustrated in Fig. 3. Possible deviations affect the magnitude of the remanence flux density BR and the angle β of the magnetization direction. Further examples of rotor tolerances, not considered within this study, would be a displacement of the magnet and rotor eccentricity.

Fig. 3. Magnetization faults.

A. Theoretical analysis 

Within this study, the focus is on the influence of rotor tolerances on electrical signals. According to [5], for non-ideal rotor components new harmonic orders $n_{rf}$ are expected to appear in the back-EMF spectrum, which are a function of the pole pair number p:

$$ n_{rf} = 1 \pm \frac{k}{p} \quad \text{with } k \in \mathbb{N}. \tag{1} $$

In the following, this relation shall be verified and specialized for the machine investigated.

The back-EMF $V_i$ is the induced voltage at no-load condition (open circuit). For a coil with w turns, $V_i$ is defined as follows:

$$ V_i = -w \frac{d\phi}{dt} = -w \frac{d}{dt} \left( \int B \, dA \right). \tag{2} $$

Applied to a machine’s winding, it means that the 

back-EMF in one coil of the winding is determined by the 

air gap flux density B. Therefore the back-EMF presents 

the same harmonic orders which appear in the spectrum 

of the flux density. The latter will be considered for an 

order analysis. 



The magnetic flux density at the air gap of the machine 

is a rotating wave which is a function of relative position 

at the air gap α and time t [8]. It is given as the product 

of the magnetomotive force Θ (MMF) and the air gap 

permeance Λ: 

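$$ B(\alpha, t) = \Theta(\alpha, t) \cdot \Lambda(\alpha, t). \tag{3} $$

The functions of permeance, magnetomotive force and flux density can generally be represented by a series of space and time harmonics [9]:

$$ \Lambda(\alpha, t) = \sum_{y_l, z_l} \Lambda_{y_l, z_l} \cdot \cos(y_l \cdot \alpha - z_l \cdot t), \tag{4} $$

$$ \Theta(\alpha, t) = \sum_{y_t, z_t} \Theta_{y_t, z_t} \cdot \cos(y_t \cdot \alpha - z_t \cdot t), \tag{5} $$

$$ B(\alpha, t) = \sum_{y_b, z_b} B_{y_b, z_b} \cdot \cos(y_b \cdot \alpha - z_b \cdot t). \tag{6} $$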

Here, ω1 is the supply angular frequency. For clarity, the phase angle is neglected.

For derivation of the new harmonic orders caused by non-ideal rotor components only the rotor fundamental component of the MMF is considered, meaning $y_t = p$ and $z_t = \omega_1$. Furthermore, a constant air gap width is considered, meaning $y_l = 0$ and $z_l = 0$. For the faultless case, this implies:

$$ B_p(\alpha, t) = B_{p,\omega_1} \cdot \cos(p \cdot \alpha - \omega_1 \cdot t). \tag{7} $$

The above mentioned rotor tolerances lead to an 

asymmetrical distribution of the air gap field. With the 

described approach, this means a modulation of the 

MMF caused by the rotor magnets. New space and time 

harmonics k with k ∈ N appear resulting in the following 

expression for the flux density considering non-ideal rotor 

components: 

$$ B_{rf}(\alpha, t) = \sum_k B_k \cdot \cos\left((p \pm k) \cdot \alpha - (\omega_1 \pm k \omega_m) \cdot t\right). \tag{8} $$

Here, $\omega_m$ is the rotational speed with $\omega_m = \omega_1 / p$. Hence, $B_{rf}$ can be expressed as follows:

$$ B_{rf}(\alpha, t) = \sum_k B_k \cdot \cos\left((p \pm k) \cdot \alpha - \left(1 \pm \frac{k}{p}\right) \cdot \omega_1 t\right). \tag{9} $$

This expression indicates the new harmonic orders appearing in the spectrum of the flux density and equally in the back-EMF spectrum due to deviations at the machine’s rotor, as predicted in expression (1). However, it is only valid for one single coil [10]. When deriving the harmonics for one phase, the coil configuration needs to be considered. One phase of the investigated machine contains two coils which are displaced by 180°.

According to (9), the back-EMF of the first coil in one phase, assuming faulty rotor components, can be determined as follows:

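$$ V_{i,rf1}(\alpha, t) = \sum_k V_{i,k} \cdot \cos\left((p \pm k) \cdot 0^\circ - \left(1 \pm \frac{k}{p}\right) \cdot \omega_1 t\right). \tag{10} $$

Similarly, the back-EMF of the second coil in the same phase is:

$$ V_{i,rf2}(\alpha, t) = \sum_k V_{i,k} \cdot \cos\left((p \pm k) \cdot 180^\circ - \left(1 \pm \frac{k}{p}\right) \cdot \omega_1 t\right). \tag{11} $$

The resulting back-EMF $V_{i,rf,ph}$ for one phase can be determined by the superposition of the two coils:

$$ V_{i,rf,ph} = V_{i,rf1} + V_{i,rf2} = \sum_k V_{i,k} \cos\left(\left(1 \pm \frac{k}{p}\right) \omega_1 t\right) \cdot \left[1 + \cos\left((p \pm k) \cdot 180^\circ\right)\right]. \tag{12} $$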

For odd numbers (p ± k), the bracketed factor in (12) is equal to zero. With p = 4, (p ± k) is even only for even k, so only even values of k appear in the back-EMF spectrum.
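The step from (10) and (11) to (12) is not spelled out in the text. A possible intermediate step, using the shorthand $A_k$ and $\theta_k$ (introduced here only for readability, not taken from the paper), is:

```latex
% Shorthand (not from the paper): A_k = (p \pm k) \cdot 180^\circ,
% \theta_k = (1 \pm k/p)\,\omega_1 t.
\begin{align*}
\cos(A_k - \theta_k) &= \cos A_k \cos\theta_k + \sin A_k \sin\theta_k
                      = \cos A_k \cos\theta_k ,
  && \text{since } \sin A_k = 0 \text{ for integer } p \pm k, \\
\cos(0^\circ - \theta_k) &= \cos\theta_k , \\
V_{i,rf1} + V_{i,rf2} &= \sum_k V_{i,k} \cos\theta_k \,\bigl[\,1 + \cos A_k\,\bigr],
  && \cos A_k = (-1)^{\,p \pm k}.
\end{align*}
```

The bracket therefore equals 2 for even (p ± k) and 0 for odd (p ± k), which is the cancellation used above.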

Finally, (1) can be specialized for the analyzed machine, 

indicating the new harmonic orders appearing in 

the spectrum of back-EMF in case of non-ideal rotor 

components: 

$$ n'_{rf} = 1 \pm \frac{2k'}{p} \quad \text{with } k' \in \mathbb{N}. \tag{13} $$

For the faultless case where the air gap field is symmetrical, 

the appearing orders are determined by the winding 

arrangement [5]. Considering a three-phase winding, these harmonic orders are:

$$ n = 6m \pm 1 \quad \text{with } m \in \mathbb{N}. \tag{14} $$

The mentioned new harmonic orders caused by faults 

appear in addition to (14). 
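As a plausibility check, the orders from (1) that survive the two-coil superposition in (12) can be enumerated for p = 4 and compared with (13). This is a minimal sketch: the function name, the cut-off `k_max` and the folding of negative orders onto their magnitude are assumptions made for illustration.

```python
from fractions import Fraction

def fault_orders(p, k_max):
    """Orders 1 ± k/p from (1) that are not cancelled by the factor in (12)."""
    orders = set()
    for k in range(1, k_max + 1):
        for sign in (+1, -1):
            # (12): 1 + cos((p ± k)·180°) vanishes for odd p ± k.
            if (p + sign * k) % 2 != 0:
                continue
            # Assumption: a negative order shows up at its magnitude in a spectrum.
            orders.add(abs(1 + sign * Fraction(k, p)))
    return sorted(orders)

p = 4
orders = fault_orders(p, k_max=6)

# Orders given directly by (13), 1 ± 2k'/p, for k' = 1..3.
orders_13 = sorted({abs(1 + s * Fraction(2 * k, p)) for k in range(1, 4) for s in (1, -1)})

f1 = p * 3000 / 60  # electrical fundamental in Hz at 3000 rpm
for n in orders:
    print(f"order {float(n):.1f} -> {float(n) * f1:.0f} Hz")
```

Both enumerations give the same set, which contains the orders 0.5 and 2.5 discussed with the simulation results.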

For the current, the harmonic orders can be derived 

analogously. Ampere’s law reveals the general relation 

between electrical current I and magnetic flux density 

B:

$$ \mu_0 \cdot I = \oint_S B \, ds. \tag{15} $$

In practice, the current may additionally be affected by 

the control system and by the inverter supply. These 

impacts need to be considered in order to avoid wrong 

interpretation of the measured signals. 

IV. METHODOLOGY 

To determine the influence of the rotor tolerances on the back-EMF and current characteristics, numerical field simulations are used. A reliable analysis requires a sufficiently large number of experiments, which can be performed with less effort by simulation than by measurement. In addition, the interpretation of measurement results is difficult in this context, because the deviations actually present in a given prototype are unknown. Intentionally building prototypes with defined tolerances is very difficult to realize.


A. Finite Element Analysis 

To calculate the back-EMF, a two-dimensional time-stepping Finite Element Analysis (FEA) is applied. No-load operation at a speed of 3000 rpm is assumed and the voltage is calculated by use of the time derivative of the magnetic flux, as in equation (2).

For analysis, a discrete Fourier transform (DFT) is performed 

which yields the spectrum of back-EMF as shown 

in Fig. 4. For the ideal faultless case with symmetrical 

air gap field, harmonic orders appear according to (14). 
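The post-processing chain described above (time derivative of the flux as in (2), followed by a DFT) can be sketched as follows. The flux waveform, the number of turns and the sampling parameters are hypothetical stand-ins for FEA output, and the central-difference derivative is one possible discretization, not necessarily the one used by the authors.

```python
import numpy as np

w = 100                    # hypothetical number of turns
f1 = 200.0                 # electrical fundamental in Hz (p = 4 at 3000 rpm)
periods = 4                # p electrical periods = one mechanical revolution
samples_per_period = 256
n = periods * samples_per_period
dt = 1.0 / (samples_per_period * f1)
t = np.arange(n) * dt

# Hypothetical flux per turn in Wb: fundamental plus 5th and 7th order.
flux_amplitudes = {1: 1.0e-3, 5: 2.0e-5, 7: 1.0e-5}
phi = sum(a * np.cos(2 * np.pi * order * f1 * t) for order, a in flux_amplitudes.items())

# Equation (2): v = -w * dphi/dt, with a periodic central difference.
v = -w * (np.roll(phi, -1) - np.roll(phi, 1)) / (2 * dt)

# Single-sided amplitude spectrum over harmonic order.
amplitude = 2 * np.abs(np.fft.rfft(v)) / n
order_axis = np.fft.rfftfreq(n, dt) / f1          # resolution 1/periods = 0.25
found_orders = order_axis[amplitude > 0.01 * amplitude.max()]
fifth_to_first = amplitude[5 * periods] / amplitude[1 * periods]
print(np.round(found_orders, 3), round(float(fifth_to_first), 3))
```

Analyzing a full mechanical revolution (p electrical periods) gives an order resolution of 1/p, which is what makes fractional orders such as 0.5 resolvable in the first place.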

Fig. 4. Back-EMF spectrum assuming faultless case. [Plot: back-EMF in p.u. versus harmonic order (0 to 8), with an enlarged view of the range 0 to 0.05 p.u.]

1) Parameterization by statistical measurements: For parameterization of the FEA model, a statistical verification is performed. The back-EMF is measured for ten prototypes of the machine. The resulting first order is evaluated in the form of a histogram shown in Fig. 5. Based on these results, the magnets’ material properties (BR) within the model are adjusted so that the simulated first-order value agrees with the mean of the measured first-order values.
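The adjustment step can be illustrated with a short sketch. All numbers are hypothetical, and the single rescaling step assumes that the first-order back-EMF scales linearly with BR; the paper does not state how the adjustment was carried out.

```python
from statistics import mean

# Hypothetical measured first-order back-EMF of ten prototypes (p.u.).
measured_first_order = [0.995, 1.002, 0.998, 1.004, 1.001,
                        0.997, 1.000, 1.003, 0.999, 1.001]
simulated_first_order = 1.020   # hypothetical FEA result with the initial B_R
b_r_initial = 1.20              # hypothetical remanence in T

# Assumption: first-order back-EMF is proportional to B_R, so one rescaling
# step suffices; otherwise the update would be repeated with a new FEA run.
target = mean(measured_first_order)
b_r_adjusted = b_r_initial * target / simulated_first_order
print(f"target = {target:.3f} p.u., adjusted B_R = {b_r_adjusted:.4f} T")
```

If saturation makes the relation noticeably nonlinear, the same update can be iterated until simulated and measured values agree within the measurement scatter.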

Fig. 5. Measured Back-EMF histogram. [Plot: absolute frequency (0 to 5) versus first-order back-EMF in p.u. (0.99 to 1.01).]

B. Extended d-q model 

A d-q model is a common way to describe the PMSM’s 

dynamical behavior considering the control system. Here, 

the application of such a model is studied to calculate 

the current. Since the common d-q model only takes the 

fundamental wave into account, it is not able to consider 

non-ideal behavior such as local defects studied within 

this work. Hence, an extended d-q model [11] is applied

to calculate the current. Here, FEA is used to extract 

additional elements to extend the common d-q equations. 

In the following, a start-up of the machine from zero to 

3000 rpm is simulated and the stator current is analyzed 

by use of a short-time Fourier transform (STFT). This 

yields the spectrum including the frequency distribution 

over time of the non-stationary current signal. Fig. 6 

shows the result for the faultless case. The value of 

current is represented by a color range, where light colors 

mean a high and dark colors a low value. It can be seen 

that the harmonic orders are the same as for the back- 

EMF. 
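The STFT evaluation of a start-up can be sketched with NumPy alone. The current signal below is synthetic (a fundamental and a 5th order following a linear speed ramp), and the sampling rate, ramp time and window length are assumptions; it does not reproduce the extended d-q model.

```python
import numpy as np

fs = 10_000.0                 # sampling rate in Hz (assumption)
t_ramp = 1.0                  # duration of the start-up in s (assumption)
p = 4
t = np.arange(int(fs * t_ramp)) / fs

f1_end = p * 3000.0 / 60.0    # 200 Hz at 3000 rpm
f1 = f1_end * t / t_ramp      # linear speed ramp from standstill
phase = 2 * np.pi * np.cumsum(f1) / fs
current = np.cos(phase) + 0.1 * np.cos(5 * phase)   # hypothetical orders 1 and 5

def stft_magnitude(x, n_win, hop):
    """Magnitude of a Hann-windowed short-time Fourier transform."""
    window = np.hanning(n_win)
    starts = np.arange(0, len(x) - n_win + 1, hop)
    frames = np.stack([x[s:s + n_win] * window for s in starts])
    return starts, np.abs(np.fft.rfft(frames, axis=1))

n_win, hop = 1000, 250
starts, mag = stft_magnitude(current, n_win, hop)
freqs = np.fft.rfftfreq(n_win, 1 / fs)          # 10 Hz frequency resolution
centres = (starts + n_win / 2) / fs             # frame centre times in s
peak_freq = freqs[np.argmax(mag, axis=1)]       # strongest line per frame
```

Plotting `mag` over `centres` and `freqs` gives a time-frequency map of the kind described above, in which each harmonic order appears as a line whose frequency rises with speed.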

Fig. 6. Current spectrum assuming faultless case. 

Here, sine-wave excitation is assumed. With the presented 

model it is also possible to simulate inverter 

operation. However, modeling the inverter leads to a 

computationally expensive model. Fig. 7 shows the result 

for the faultless case assuming inverter supply. Due to the high computational effort, it is shown with lower resolution. Besides the main harmonic orders, some new orders appear. However, these do not interfere with the orders which are expected to appear according to (13) due to tolerances. Therefore, inverter supply is not considered within this study because of the computational costs.

Fig. 7. Current spectrum assuming faultless case and inverter supply. 

V. SIMULATION RESULTS 

To develop a reliable end-of-line check, the most 

common and important fault modes should be captured. 

In the following, different approaches are applied to 



simulate various fault scenarios. The choice of the corresponding 

approach depends on the particular fault, the 

prior knowledge of the fault and the available data. 

A. Worst-case analysis 

For PMSMs, cogging torque is an undesired effect as it leads to rotational oscillations of the drive train. Cogging torque is strongly influenced by deviations caused by the manufacturing process [2], [7]. However, measuring cogging torque is very time-consuming and costly [12] and is therefore not an appropriate method for an end-of-line check. In the following, it is studied how back-EMF and current are influenced in a faulty machine with a high value of cogging torque due to magnetization faults.

Considering a deviation in the magnitude of the magnets’ remanence flux density BR, the variation of the cogging torque depends on which and how many permanent magnets are affected. In [13], Design-of-Experiments is applied to find the worst-case configuration of magnetization faults concerning cogging torque. Applied to the studied machine, the configuration shown in Fig. 8 yields the highest value of peak-to-peak cogging torque. The filled magnets represent the ones that are defective. For a deviation in BR of -10 %, the cogging torque is about seven times higher than the reference value of the ideal machine.

Fig. 8. Worst-case configuration of magnetization faults. 

Fig. 9 and Fig. 10 show the results for back-EMF and 

current in case of the described worst-case magnetization 

fault. 

Fig. 9. Back-EMF spectrum assuming worst-case magnetization fault. [Plot: spectrum (0 to 1) versus harmonic order (0 to 8) for the faulty and the faultless case, with an enlarged view of the range 0 to 0.05 for the faulty case.]

The spectra show new harmonic orders; in particular, $n'_{rf} = 0.5$ and $n'_{rf} = 2.5$ are apparent, in accordance with (1). Compared to the faultless case, the first harmonic order is reduced.


Fig. 10. Current spectrum assuming worst-case magnetization fault. 

B. Sample cases 

For the studied machine, the height of the magnet can vary between 97 % and 100 % of its desired value and the width can vary by ±1.5 %. The dimensions of some sample magnets have been measured and are used as input data for this approach. None of the measured dimensions exceeds the allowed tolerance range. Five cases are created in which every magnet is subject to the given tolerances. Fig. 11 and Fig. 12 show, as an example, the back-EMF and current spectrum of one faulty case. The first harmonic order is reduced compared to the faultless case and the new harmonic orders appear according to (13).

Fig. 11. Back-EMF spectrum assuming faulty magnet dimensions. [Plot: spectrum (0 to 1) versus harmonic order (0 to 8) for the faulty case, with an enlarged view of the range 0 to 0.05.]

Fig. 12. Current spectrum assuming faulty magnet dimensions.

When compared to the results from the worst-case magnetization fault studied in V-A, one can see that the influence of deviations in the magnets’ dimensions within the allowed tolerance range is not significant. The spectra show the same specific characteristics but with smaller amplitudes. For all five studied cases, the simulated spectra do not differ considerably from the faultless case.

C. Stochastic analysis 

In [14], the influence of varying qualities of the permanent magnet has been investigated by applying a stochastic analysis. This is applied here to compare the influences of deviations in the magnetization magnitude and magnetization direction. Overall, 60 failure configurations are studied. For 20 cases, the remanence flux density BR is assumed to be Gaussian distributed with 3σ equal to 10 % of the nominal value. For 20 other cases, the magnetization direction is assumed to be normally distributed with a standard deviation of 5°. The remaining 20 cases present both kinds of deviations. Applying FEA, cogging torque and back-EMF are calculated and evaluated by means of histograms. The distribution of the first harmonic order of the back-EMF is shown in Fig. 13. It shows the range over which the back-EMF is influenced by the different magnetization deviations.
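The generation of the 60 fault configurations can be sketched as follows. The distributions are taken from the text; drawing each of the eight magnets independently, the seed and all names are assumptions made for illustration, since the paper does not describe the sampling procedure.

```python
import numpy as np

rng = np.random.default_rng(seed=0)   # fixed seed only for reproducibility
n_magnets = 8                         # eight rotor magnets
n_cases = 20                          # cases per fault type

sigma_br = 0.10 / 3.0                 # 3*sigma = 10 % of the nominal B_R (p.u.)
sigma_beta_deg = 5.0                  # standard deviation of the direction error

def sample_cases(vary_magnitude, vary_direction):
    """Per-magnet B_R (p.u.) and direction error (deg) for n_cases rotors."""
    b_r = np.ones((n_cases, n_magnets))
    beta = np.zeros((n_cases, n_magnets))
    if vary_magnitude:
        # Assumption: every magnet is drawn independently.
        b_r = rng.normal(1.0, sigma_br, size=(n_cases, n_magnets))
    if vary_direction:
        beta = rng.normal(0.0, sigma_beta_deg, size=(n_cases, n_magnets))
    return b_r, beta

configs = {
    "magnitude": sample_cases(True, False),
    "direction": sample_cases(False, True),
    "magnitude and direction": sample_cases(True, True),
}
total_cases = sum(b_r.shape[0] for b_r, _ in configs.values())
```

Each row of `b_r` and `beta` would then parameterize one FEA run, whose first-order back-EMF and peak-to-peak cogging torque enter the histograms.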



Fig. 13. Back-EMF histogram considering magnetization faults. [Plot: three histograms with a horizontal axis from 0.98 to 1.04, with panels for magnetization magnitude, magnetization direction, and magnitude and direction.]

It can be seen that the influence of the deviations in the magnetization direction is very small. The magnitude fault dominates. This is to be expected for the studied machine as it has interior magnets.

The same analysis is performed for the peak-to-peak cogging torque, which is presented in Fig. 14. Again, the corresponding distribution shows that the influence of the magnetization direction is small compared to that of the deviation in magnitude.



Fig. 14. Cogging torque histogram considering magnetization faults. [Plot: three histograms over the peak-to-peak cogging torque in p.u. (0 to 6), with panels for magnetization magnitude, magnetization direction, and magnitude and direction.]

Generally, magnetization faults influence the cogging torque and back-EMF characteristics simultaneously [15]. For both quantities, new harmonic orders arise depending on the asymmetry caused in the air gap field. Based on the presented studies, it becomes apparent that the influence of magnetization tolerances on cogging torque is more significant than on back-EMF. On the one hand, this means that the machine’s behavior is strongly influenced in general. On the other hand, as a cogging torque test is excluded for an end-of-line concept, it implies high functional requirements on the measurement devices in order to detect faults by means of back-EMF analysis. This shows the importance of an influence analysis such as presented in this study. The impact of tolerances needs to be studied in order to distinguish it from measurement inaccuracy.


In this work, the influence of non-ideal manufactured 

rotor components of a PMSM on its back-EMF and 

current characteristic is studied. It has been shown that 

electrical quantities are applicable to realize tolerance 

diagnosis. Especially the stator current approves to be 

a promising approach due to its time- and cost-saving 

setup. The new harmonic orders caused by rotor faults 

are derived within a theoretical analysis and confirmed 

by the simulation results. 

An end-of-line check could be realized in such a 

way that all machines presenting a certain level in these 

specific characteristics are rejected. With the presented 

methods, the range of these distinguishing characteristics 

can be evaluated to detect the corresponding levels for 

rejection. In doing so, measurement accuracy should be taken 

into account. 
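The rejection rule described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function names, the choice of fault-related harmonic orders, the rejection limit and the way measurement uncertainty enters as a simple additive margin are all hypothetical and are not taken from the study.

```python
import cmath
import math


def harmonic_amplitude(samples, order):
    """Amplitude of harmonic `order` of a signal sampled uniformly over one full period."""
    n = len(samples)
    acc = sum(x * cmath.exp(-2j * math.pi * order * i / n)
              for i, x in enumerate(samples))
    return 2.0 * abs(acc) / n


def reject_machine(samples, fault_orders, limit, measurement_uncertainty):
    """Reject only if a fault-related harmonic exceeds the limit by more than
    the measurement uncertainty (hypothetical additive margin)."""
    return any(
        harmonic_amplitude(samples, k) > limit + measurement_uncertainty
        for k in fault_orders
    )


if __name__ == "__main__":
    n = 256
    # Synthetic signal: working harmonic of order 4 plus a small fault harmonic of order 1.
    signal = [math.sin(2 * math.pi * 4 * i / n) + 0.05 * math.sin(2 * math.pi * i / n)
              for i in range(n)]
    print(reject_machine(signal, fault_orders=[1], limit=0.02,
                         measurement_uncertainty=0.01))
```

In practice the limit would be derived from the simulated distribution of the distinguishing harmonics, and the margin from the accuracy of the measurement device.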


Comparing the different rotor tolerances, all exhibit the same characteristics, but to different extents depending on the fault's intensity and arrangement. As feedback for manufacturing, a differentiation of the various tolerances would be beneficial, but it cannot be achieved with the presented analysis. However, the focus of the end-of-line check is to verify the machines' quality, which can be realized by the suggested approach. The results of this study prove valuable for implementing an accurate quality control for PMSMs, ultimately improving their reliability.

REFERENCES 

[1] S. Nandi, H.A. Toliyat, and X. Li, ”Condition Monitoring and Fault 

Diagnosis of Electrical Motors - A Review,” IEEE Transactions on 

Energy Conversion, vol. 20, no. 4, pp. 710-729, December 2005. 

[2] L. Gasparin, A. Cernigoj, S. Markic, and R. Fiser, ”Additional 

Cogging Torque Components in Permanent-Magnet Motors Due 

to Manufacturing Imperfections,” IEEE Transactions on Magnetics, 

vol. 45, no. 3, pp. 1210-1213, March 2009. 

[3] P.J. Tavner, ”Review of condition monitoring of rotating electrical 

machines,” IET Electric Power Applications, vol. 2, no. 4, pp. 215-247, 2008.

[4] W. le Roux, R. G. Harley, and T. G. Habetler, ”Detecting Rotor 

Faults in Low Power Permanent Magnet Synchronous Machines,” 

IEEE Transactions on Power Electronics, vol. 22, no. 1, pp. 322- 

328, January 2007. 

[5] D. Casadei, F. Filippetti, C. Rossi, A. Stefani, and D.J. Ewins, 

”Magnets faults characterization for Permanent Magnet Synchronous 

Motors,” IEEE International Symposium on Diagnostics 

for Electric Machines, Power Electronics and Drives, pp. 1-6, 2009. 

[6] A. Flach, F. Drager, M. Ayeb, and L. Brabetz, ”A New Approach to 

Diagnostics for Permanent-Magnet Motors in Automotive Powertrain 

Systems,” IEEE International Symposium on Diagnostics for 

Electrical Machines, Power Electronics and Drives, pp. 234-239, 


[7] G. Heins, T. Brown, and M. Thiele, ”Statistical Analysis of the 

Effect of Magnet Placement on Cogging Torque in Fractional Pitch 

Permanent Magnet Motors ,” IEEE Transactions on Magnetics, vol. 

47, no. 8, pp. 2142-2148, August 2011. 

[8] J.R. Cameron, W.T. Thomson, and A.B. Dow, ”Vibration and current 

monitoring for detecting airgap eccentricity in large induction 

motors,” IEE Proceedings B Electric Power Applications, vol. 133, 

no. 3, pp. 155 - 163, May 1986. 

[9] B.M. Ebrahimi, J. Faiz, and M.J. Roshtkhari, ”Static-, Dynamic- and 

Mixed-Eccentricity Fault Diagnoses in Permanent-Magnet 

Synchronous Motors,” IEEE Transactions on Industrial Electronics, 

vol. 56, no. 11, pp. 4727-4739, November 2009. 

[10] J. Urresty, J. Riba Ruiz, and L. Romeral, ”A Back-emf Based 

Method to Detect Magnet Failures in PMSMs ,” IEEE Transactions 

on Magnetics, July 2012. 

[11] T. Herold, D. Franck, E. Lange, and K. Hameyer, ”Extension 

of a D-Q Model of a Permanent Magnet Excited Synchronous 

Machine by Including Saturation, Cross-Coupling and Slotting 

Effects,” International Electric Machines and Drives Conference 

(IEMDC), pp. 1379-1383, 2011. 

[12] C. Schlensok, D. van Riesen, B. Schmülling, M. Schöning, and 

K. Hameyer, ”Cogging Torque Analysis on Permanent Magnet 

Machines by Simulation and Measurement,” tm - Technisches 

Messen, vol. 74, no. 7-8, pp. 393-401, August 2007. 

[13] I. Coenen, M. van der Giet, and K. Hameyer, ”Manufacturing Tolerances: 

Estimation and Prediction of Cogging Torque Influenced 

by Magnetization Faults,” IEEE Transactions on Magnetics, vol. 

48, no. 5, pp. 1932-1936, May 2012. 

[14] I. Coenen, M. Herranz Gracia, and K. Hameyer, ”Influence and 

evaluation of non-ideal manufacturing process on the cogging 

torque of a permanent magnet excited synchronous machine,” 

COMPEL, vol. 30, no. 3, pp. 876-884, 2011. 

[15] K. Kim, S. Lim, D. Koo, and J. Lee, ”The Shape Design of 

Permanent Magnet for Permanent Magnet Synchronous Motor 

Considering Partial Demagnetization,” IEEE Transactions on Magnetics, 

vol. 42, no. 10, October 2006.


 

 

Hai Van Jorks, Erion Gjonaj and Thomas Weiland 

TU Darmstadt, Institute of Computational Electromagnetics, Schloßgartenstraße 8, 64289 Darmstadt, Germany 

Abstract— High frequency eddy currents are investigated and the Common Mode Input Impedance of a PWM controlled 

induction motor is calculated from finite element simulations. In order to determine machine parameters accurately, two 

modelling approaches are compared. The first is a two-dimensional simulation approach where iron core lamination effects 

are included by means of an equivalent material approximation. The second approach consists in fully three-dimensional 

analysis taking into account explicitly the eddy currents induced in the laminations. It is shown that homogenised equivalent 

material models may lead to large errors in the calculation of machine inductances, especially at high frequencies. However, 

the Common Mode Input Impedance, which is the final parameter of interest, seems to be less affected by the lamination 

modelling. 

Index Terms—eddy currents, finite element, lamination, PWM 


In modern drive systems fast switching inverters are 

the source of high frequency common mode voltages at 

the motor terminals. Due to stray capacitances between 

windings and grounded iron parts of the machine, a 

common mode current is excited and for its part may 

cause circulating bearing currents which may damage the 

bearing [1]. The phenomenon can be described by an equivalent circuit representation which employs the frequency-dependent Common Mode Input Impedance, defined as the ratio of common mode voltage to common mode current.

 

Common Mode Input Impedance can be computed 

using 

 

 

Figure 1: Lumped parameter model of a two conductor system. 

Parameters can be gathered in the impedance and capacitance matrix 

Firstly, the stray capacitances are extracted from 

electrostatic and winding impedances from 

magnetoquasistatic simulations. Ohmic conductivity of 

winding insulation is negligible. 

Secondly, the winding scheme is taken into account to 

match the corresponding voltages and currents at the 

front and rear end of the machine. At this point, end-winding inductances can be included in the model, but 

require distinct modelling approaches and are, therefore, 

neglected in the present analysis. 

Considering the laminated middle part of the motor, it 

is common to use two-dimensional (2D) models of the 

motor cross-section to compute the self and coupling 

impedances. In earlier literature it is proposed to apply 

the finite element (FE) method within a single stator slot 

while the magnetic field was assumed to be zero outside 

the slot perimeter [2]. However, as shown in a recent 

investigation [3], strong inductive coupling at high 

frequencies (several to ) may occur even 

between distant slots. The effect is caused by the core 

lamination, which, despite the small skin depth (7), 

promotes the spreading of high frequency magnetic 

fluxes over the iron sheets’ surfaces. In [4] the lamination 

was approximated by an additional impedance. 

Moreover, it is possible to include the lamination already 

in the FE model. Therefore, we consider the entire motor 

cross section in the magnetoquasistatic simulations. 

In the case of 2D analysis, modelling the laminated 

core in a plane requires the application of 

homogenization techniques. We investigated the accuracy 

of the widely used formulation (2) for a broad frequency 

range . The reference solution was 

obtained from fully three-dimensional (3D) simulations. 

Since resolving the small skin depth in motor laminations 

in a 3D mesh is computationally very costly, general 

purpose simulation software cannot be utilized. On that 

account, we developed a specialised 3D FE simulation 

tool, which takes advantage of the periodicity of the 

lamination stack, but otherwise does not use any 

approximation on motor geometry or on the material 

properties of the laminated core. 

II. 2D LAMINATION MODELLING 

A well-known homogenization model for laminated 

cores [5] utilizes a frequency dependent equivalent 

permeability for the iron core given by, 

 

 

 

 

 

 

 

where $\mu_0\mu_r$ is the permeability of iron, $2b$ the thickness of the plate and $\delta$ the skin depth at a given frequency. The 

magnetic field problem for the homogenized core is 

reduced to a planar problem. While this approach allows 

for efficient 2D FE modelling, accuracy at higher 

frequencies may not be sufficient. Figure 2 shows the B-field plot of the motor model obtained from simulations

with “FEMM” [6], a 2D open source software which 

employs the approach (2). In order to obtain self and 

mutual impedances of the multi-conductor system, only 

one conductor was excited by a current. The spreading of 

the flux over the lamination as well as across the periodic 

boundary of the computational model can be observed. 

The impedance matrix was extracted and will be 

compared to the 3D reference in Section IV.B. 
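To make the frequency dependence of the homogenized core concrete, the following sketch evaluates a commonly used form of the equivalent permeability for a laminated stack, $\mu_{eq} = \mu_0\mu_r \tanh(k)/k$ with $k = (1+j)\,b/\delta$ and $\delta = \sqrt{2/(\omega\sigma\mu_0\mu_r)}$. It is assumed here that this is the form meant by (2); the conductivity and sheet thickness in the usage lines are placeholder values, not the parameters of the paper, and the stacking factor is ignored.

```python
import cmath
import math

MU_0 = 4e-7 * math.pi  # vacuum permeability in H/m


def skin_depth(freq_hz, sigma, mu_r):
    """Skin depth in metres for conductivity sigma (S/m) and relative permeability mu_r."""
    omega = 2.0 * math.pi * freq_hz
    return math.sqrt(2.0 / (omega * sigma * MU_0 * mu_r))


def equivalent_permeability(freq_hz, sigma, mu_r, half_thickness):
    """Complex equivalent permeability of a sheet of thickness 2*half_thickness
    (assumed form: mu0*mu_r*tanh(k)/k with k = (1+j)*b/delta)."""
    delta = skin_depth(freq_hz, sigma, mu_r)
    k = (1 + 1j) * half_thickness / delta
    return MU_0 * mu_r * cmath.tanh(k) / k


if __name__ == "__main__":
    sigma = 1.0e7   # S/m, placeholder value
    mu_r = 1000.0   # relative permeability
    b = 0.25e-3     # m, half sheet thickness, placeholder value
    for f in (1.0, 1e3, 1e6):
        ratio = equivalent_permeability(f, sigma, mu_r, b) / (MU_0 * mu_r)
        print(f, abs(ratio), math.degrees(cmath.phase(ratio)))
```

At low frequency the ratio tends to one, while for $b \gg \delta$ its magnitude falls off as $\delta/(\sqrt{2}\,b)$ with a phase approaching -45°, which is the regime where the planar approximation is most strained.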

Figure 2: 2D model of cross-sectional motor geometry (60° section) 

with magnitude of magnetic flux density at 1 MHz 

III. FULLY 3D FE ANALYSIS 

A. 3D FE formulation 

Maxwell’s equations in frequency domain expressed 

by a magnetic vector potential yield 

 

 

 

where is the permeability, the conductivity, is a 

voltage gradient used for excitation. Two types of 

boundary conditions on are applied. Firstly, 

 

where is the unit normal vector and is the 

 

 

 

 

 

 

 

 

 

triangular prisms are chosen, because they allow for an 

efficient discretisation of the very thin iron sheet (Fig. 3). 

Applying Galerkin’s method to (3) the matrix equation 

 

is generated, where the complex matrix combines the 

discrete operators corresponding to the left hand side of 

(3), vector holds the degrees of freedom (DOFs) of 

the vector potentials and vector the exciting currents. 

In the case of voltage excitation of individual conductors 

the relation 

 


can be employed [7], where 

, is the 

 

angular frequency and is the coupling matrix for the 

vector of the wire voltages . The 

important case where a single conductor n is excited can 

be obtained by setting in (6) and for all 

other conductors (see also Section IV.A). 

B. Reduction of the problem size 

A high frequency 3D FE model of the complete motor 

geometry is still far beyond today’s computing capacities. 

But even if we consider just a slice of the motor 

with the thickness of half a lamination sheet , 

an appropriate 3D discretisation will lead to several 

million mesh cells. In order to model eddy currents at 

frequencies up to the discretisation has to 

resolve the small skin depth in the high-permeability iron 

 

 

 

In the analysis, the following parameters were used: 

, 

, 

where $\sigma$ and $\mu$ are the electrical conductivity and the permeability of iron, respectively. 

Parallelization of our 3D FEM code is a key feature, 

nevertheless, further reduction of the problem size is 

particularly important. In the case of common mode 

excitation of a 3-phase 4-pole induction machine, the 

field pattern in the motor cross section is periodic with 

respect to a 60° rotation around the longitudinal axis of 

the motor. This reduces the computational domain to a 

60° section while periodic boundary conditions are 

applied to the cut planes (Fig. 3). 

Figure 3: Simulation mesh and magnitude of the magnetic flux density 

at 1 MHz for the 3D motor model. For better visibility, the model is 

scaled by a factor 100 in the transversal direction. 

IV. IMPEDANCE MATRIX CALCULATION 

A. Extraction procedure 

The standard procedure to extract the impedances of 

the cross-sectional conductors from FE analysis, is to 

excite the -th conductor with a current of and set all 

the other conductors to . After running the simulation, 

the induced voltages in all conductors have to be 

computed from the magnetic vector potential solution.

This procedure has to be repeated for all 

conductors. A section of the analyzed 

induction motor holds 120 stator and 8 rotor conductors. 

If a general purpose FEM software is used, this leads to a 

large computational overhead, which makes the method 

inconvenient. However, the extraction procedure can be 

condensed into a single simulation cycle. In this way, 

common steps like loading of the mesh, setup of the curl-curl matrix and its LU decomposition have to be 

performed only once (see Fig. 4). Referring to the 3D 

simulation of the motor cross section, a speedup factor of 

could be obtained. As was shown in [7], 

impedance matrix can be computed from 

 

Still, in order to avoid the explicit inversion of the sparse 

matrix , the equation system (5) has to be solved 

times, with being the number of conductors. A detailed 

overview of the implemented algorithm is depicted in 

Fig. 4. 
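The idea of factorizing the system matrix once and reusing the factors for every conductor excitation can be sketched as follows. This is a small dense, pure-Python illustration with hypothetical function names; it is not the authors' parallel sparse implementation, and the subsequent step that turns the solutions into the impedance matrix is not reproduced here.

```python
def lu_factor(a):
    """LU factorization with partial pivoting of a square (complex) matrix.
    Returns the packed LU factors and the row permutation."""
    n = len(a)
    lu = [row[:] for row in a]
    perm = list(range(n))
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(lu[r][col]))
        if abs(lu[pivot][col]) == 0:
            raise ValueError("matrix is singular")
        lu[col], lu[pivot] = lu[pivot], lu[col]
        perm[col], perm[pivot] = perm[pivot], perm[col]
        for row in range(col + 1, n):
            lu[row][col] /= lu[col][col]
            factor = lu[row][col]
            for k in range(col + 1, n):
                lu[row][k] -= factor * lu[col][k]
    return lu, perm


def lu_solve(lu, perm, b):
    """Solve A x = b using the factors returned by lu_factor."""
    n = len(lu)
    y = [b[p] for p in perm]
    for i in range(n):                 # forward substitution (unit lower triangle)
        for k in range(i):
            y[i] -= lu[i][k] * y[k]
    for i in reversed(range(n)):       # back substitution (upper triangle)
        for k in range(i + 1, n):
            y[i] -= lu[i][k] * y[k]
        y[i] /= lu[i][i]
    return y


def solve_all_excitations(system_matrix, excitations):
    """Factorize once, then solve for every excitation vector (one per conductor)."""
    lu, perm = lu_factor(system_matrix)
    return [lu_solve(lu, perm, rhs) for rhs in excitations]
```

The factorization cost is paid once, while each additional conductor only adds a pair of triangular solves, which is where the reported speedup of the condensed procedure comes from.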

Figure 4: Flowchart of computational algorithm 

B. Simulation results 

The same motor geometry is used in the 2D (Fig. 2) 

and the 3D (Fig. 3) case. The simulation time of a 3D 

model with DOFs on a cluster with 60 nodes 

was for a sweep of 7 frequency points. Figure 5 

shows the magnitude of self-impedance of one stator 

conductor extracted from 2D and 3D simulations. The 2D 

solution which employs the lamination model (equivalent 


permeability) shows major discrepancies from the 3D 

reference case. For additional validation of the simulation 

models, the same geometry was tested without laminated 

materials, i.e. the iron core forms a massive block. Thus, 

2D analysis does not require the lamination formulae and 

therefore should give the same results as 3D analysis. The 

corresponding impedance is shown in Fig. 6. 2D and 3D 

impedances can be found in very good agreement. 

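[Figure 5 plot: conductor self-impedance $Z_{1,1}$ in Ω versus frequency $f$ in Hz for the laminated core ($\mu_{r,Fe} = 1000$); curves: 2D and 3D; the relative error between 2D and 3D is annotated at the frequency points.]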

Figure 5: Laminated iron with . Comparison of conductor 

impedance extracted from field simulations and relative error between 

2D and 3D results. 

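[Figure 6 plot: conductor self-impedance $Z_{1,1}$ in Ω versus frequency $f$ in Hz for massive iron ($\mu_{Fe} = 1000$); curves: 2D and 3D; the relative error between 2D and 3D is annotated at the frequency points.]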

Figure 6: Massive iron with . Comparison of conductor 

impedance extracted from field simulations and relative error between 

2D and 3D results. 

V. COMMON MODE INPUT IMPEDANCE 

A. Assembling of transmission line model 

The state of a multi-conductor line in the frequency 

domain is described by telegrapher’s equations. Taking 

into account the lumped parameter approximation, three 

matrix equations are obtained 

 


 

 

 

 

where , , and are the voltage and 

current vectors at the front (z=0) and rear end (z=l) of the 


motor, respectively, and and 

 

 

are the lumped element matrices in the pi-equivalent 

circuit (Fig. 1). The vector holds the currents through 

the impedances and can be eliminated by inserting one 

equation into the others. In the next step, the winding 

scheme is taken into account which further reduces the 

degrees of freedom in the system (9a-c). The proceeding 

can be found in [3]. It is important to set the current at the 

star point of the machine to zero, according to the 

common mode measuring setup, where the star point is 

not grounded. When solving the final equation system, a 

given voltage at the motor terminals will yield a certain 

input current. The ratio of the two quantities is the 

Common Mode Input Impedance.

B. 240 kW induction motor 

The 2D and 3D plots (Fig. 7) show very good 

agreement, despite the deviations in the impedance 

matrices (Fig. 5). This is understandable since the stray 

capacitances are dominant in the common mode circuit of 

the machine. However, impedances are important to 

predict resonance points. The frequency of the first 

resonance in the 2D curve is shifted by 6 kHz while its 

magnitude differs by 4%. 
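Why the common mode curve is insensitive to errors in the winding impedance below the first resonance can be seen from a deliberately reduced example: a single conductor represented by one pi-section with an open rear end. This is only a sketch under strong assumptions; the element values are hypothetical, and the actual model couples many conductors through full impedance and capacitance matrices.

```python
import math


def pi_section_input_impedance(freq_hz, resistance, inductance, capacitance):
    """Input impedance of one pi-section (C/2 to ground at each end, series R + jwL)
    with the rear terminal left open, i.e. zero current at the star point."""
    omega = 2.0 * math.pi * freq_hz
    z_series = complex(resistance, omega * inductance)
    y_half = 1j * omega * capacitance / 2.0
    # Rear branch: series impedance followed by the rear shunt capacitance only.
    z_rear_branch = z_series + 1.0 / y_half
    return 1.0 / (y_half + 1.0 / z_rear_branch)


if __name__ == "__main__":
    # Hypothetical element values, chosen only for illustration.
    c, r = 10e-9, 10.0
    for l in (1.0e-3, 1.5e-3):
        print(l, abs(pi_section_input_impedance(1e3, r, l, c)))
```

With these values the input impedance at 1 kHz is close to $1/(\omega C)$, and a 50 % change of the inductance alters it by well under 0.1 %; the inductance only becomes decisive near the resonance, consistent with the shifted resonance frequency noted above.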

[Figure 7 plot: $|Z_{com}|$ in Ω versus frequency $f$ in Hz, 2D vs. 3D; the first resonance is marked.]

Figure 7: Comparison of the Common Mode Impedance between 2D 

and 3D results. 


We developed a specialized 3D FE simulation code 

which is able to efficiently extract the impedance matrix 

of a multi-conductor setup, e.g. a motor cross section. 

Since the laminated iron is modelled with its actual 

geometry and material properties, 3D results are more 

accurate than those from 2D simulations with 

homogenized material approach. Finally, capacitance 

matrix, impedance matrix and the winding scheme are 

combined to obtain the frequency-dependent Common 

Mode Input Impedance of the machine. It can be 

observed that the solution is dominated by the 



capacitances and, therefore, less sensitive to inaccuracies 

in the impedance matrix. In future work the nonlinear 

properties of the iron will be considered in the 

simulations and the end regions of the motor will be 

included in the Transmission Line Model. 

Acknowledgement This work is funded by the 

Deutsche Forschungsgemeinschaft (DFG) under the 

collaborative research group grant FOR 575. 

REFERENCES 

[1] S. Chen, T.A. Lipo, D. Fitzgerald, “Source of induction motor 

bearing currents caused by PWM inverters”, IEEE Trans. on 

Energy Conv., Vol. 11, Iss. 1, 1996, pp. 25–32. 

[2] I. Boldea, S. A. Nasar, “The induction machine handbook”, CRC 

Press, 2002. 

[3] H. Jorks, E. Gjonaj, T. Weiland, O. Magdun, “Three-dimensional 

simulations of an induction motor including eddy current effects in 

core laminations”, IET Science, Measurement & Technology, Vol. 

6, Iss. 5, Sep. 2012, pp. 344 – 349. 

[4] P. Maeki-Ontto, J. Luomi, “Induction motor model for the analysis 

of capacitive and induced shaft voltages”, Proc. of IEMDC '05, 

May 2005, pp. 1653-1660. 

[5] R.L. Stoll, “The analysis of eddy currents”, Oxford University 

Press, 1974. 

[6] FEMM by David Meeker, version 4.2, available at 

www.femm.info. 

[7] A. Bossavit, "Two dual formulations of the 3-D eddy-current 

problem", COMPEL, Vol. 4, Iss. 2, 1985, pp. 103 – 116. 

[8] H. De Gersem, O. Henze, T. Weiland, A. Binder, “Simulation of 

wave propagation effects in machine windings”, COMPEL, Vol. 

29, Iss. 1, 2010, pp. 23 – 38.


Computation of end-winding inductances of rotating 

electrical machinery through three-dimensional 

magnetostatic integral FEM formulation 

F. Calvano 1 , G. Dal Mut 2 , F. Ferraioli 2 , A. Formisano 3 , F. Marignetti 4 , 

R. Martone 3 , G. Rubinacci 1 , A. Tamburrino 4 and S. Ventre 4 

1 Dip. di Ingegneria Elettrica, Università di Napoli Federico II, Via Claudio 25, I-80124, Naples, Italy 

2 Ansaldo Energia, Via N. Lorenzi 8, I-16152, Genova, Italy 

3 Dip. di Ing. Industriale e dell’Informazione, Seconda Università di Napoli, Via Roma 29, I-81031, Aversa (CE), Italy 

4 Dip. di Ing. Elettrica e dell’Informazione, Univ. di Cassino e del Lazio Merid., Via Di Biasio 43, I-03043, Cassino, 

Italy 

Abstract—An effective numerical technique to calculate end-winding inductances of rotating electrical machinery is presented. 

The algorithm is based on a 3D integral formulation; it allows to take into account non linearities, relative speed between 

stator and rotor and is well suited for the treatment of regions with large air volumes. Numerical implementation concerning 

the analysis of a large synchronous generator highlights the advantages of the proposed method. The aim of the paper is to 

assess, by means of an accurate 3D model, the correction factor to be applied to the inductances computed through 2D models 

to take into account the effects due to end windings. 

Index Terms— End windings, Integral FEM approach, Inductances, Flux density numerical computation, Synchronous 

generators. 


Although the computation of the inductances of 

rotating electrical machinery is a key issue both in the 

design stage and in performance assessment, its accurate 

calculation by Finite Elements approaches is still an open 

problem. 

The inductances can be split into two contributions: 

main and leakage inductances. 

The end-winding effect affects both the contributions: 

main and, especially, leakage inductances. End-winding 

inductances are at the base of both the steady-state 

operation and the dynamical behavior of electrical 

machinery [1-3]. The main inductances are related to the 

flux linkages while the leakage inductances can be 

divided into slot inductances, tooth tip inductances, 

skewing inductances and zigzag leakage inductances. In 

large machines, end windings contribute significantly to 

the values of both main and more significantly leakage 

inductances [4]. 

The contributions to the inductances due to the active 

length of the conductor can easily be computed either 

numerically, by standard 2D FEM analyses, or 

analytically by means of infinite length models. On the 

contrary, the end winding contribution can only be 

computed from the actual 3D magnetic field distribution. 

Most analytical approaches to the magnetic field 

computation [5, 6], including the most recent ones [7], are 

based on the solution of the Biot-Savart law through 

equivalent current sheet representations, using 

axisymmetric hypothesis [8] or the theory of images [9]. 

Both three dimensional and two-dimensional 

techniques based on FEM can be used as an alternative, 

but they also rely on rough approximations to reduce the 

number of nodes [10,11]. End regions fields and fluxes 

can be very complex to be computed, especially for large 

power machines, where end regions may occupy up to 

one third of the total machine length. 

The aim of the paper is to propose a method based on 

the use of an accurate 3D FEM simulation to improve the 

accuracy of the standard 2D model achieved via a 

commercial software. 

The 3D FEM technique proposed here takes advantage of an integral formulation implemented in a non-commercial code. Such an approach has previously been applied to compute end-winding forces [12,13]. 

In this paper the analysis of a large synchronous 

generator is considered as an example. Field simulations 

in different working conditions are used to assess the 

influence of end effects on flux linkages. 

The proposed approach has several advantages: (1) a reduced number of elements is required to model the end regions, because integral formulations do not require the discretization of the air region but only of the material regions (conducting and/or magnetic materials); (2) neighboring elements do not need to share nodes, allowing for more freedom in meshing complex geometries; (3) the formulation provides an inherent capability to include air-spaced moving parts, as no interface mesh is needed [13]. 

This approach is therefore particularly effective to 

model generator end regions because it can also take into

account rotor motion and magnetic nonlinearities. 

The paper is organized as follows: Section II presents 

the basis of the integral formulation and its numerical 

implementation. Flux expressions coming from integral 

formulation are also discussed. Then, the 3D correction 

with respect to 2D fluxes and inductances is introduced in 

Section III. 

Such formulation is used in Section IV to look for the 

3D flux density in the end regions of a large synchronous 

generator. The 3D correction to the 2D quantities is then 

performed according to the proposed formulation. 

Section IV contributes in particular to extend the 

knowledge of rotating electrical machinery by providing a 

powerful numerical tool to compute lumped parameters in 

the analytical model of synchronous generators with a 

precision superior to that achieved by classical 2D finite 

element models. As a matter of fact, with the proposed 

model, the contribution to terminal quantities such as the 

reactances coming from the end winding region can be 

accurately taken into account. Terminal quantities are 

finally compared with experimental measurements. 

II. INTEGRAL FORMULATION AND ITS NUMERICAL 

IMPLEMENTATION 

It is well known that, for a synchronous machine operating at steady state, it is sufficient to refer to a nonlinear magnetostatic model for the computation of the inductances [16, 17]. The currents in the stator and rotor coils are assumed to be assigned for any position of the rotor, in order to focus attention on the magnetostatic problem formulation. However, for assigned voltage, active power and reactive power, the field currents and inductances can be computed by solving a sequence of nonlinear magnetostatic problems [17]. In any case, it is possible to neglect the effects of the eddy currents in the massive conductive parts of the device. 

The numerical model is based on an integral 

formulation of the nonlinear magnetostatic problem in 

terms of the unknown magnetization M. The solution is 

obtained by means of a Picard-Banach iteration whose 

convergence can be theoretically proved when the 

magnetic constitutive equation is uniformly monotonic 

and Lipschitzian [14, 15]. 

In particular, by using the Biot-Savart law, the 

magnetic induction can be expressed in terms of its 

sources as: 

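$$
\mathbf{B}(\mathbf{r}) = \hat{\mathcal{B}}[\mathbf{M}](\mathbf{r}) = \mathbf{B}_S(\mathbf{r}) + \mu_0\mathbf{M}(\mathbf{r}) - \frac{\mu_0}{4\pi}\int_{V_f} \nabla'\cdot\mathbf{M}(\mathbf{r}')\,\frac{\mathbf{r}-\mathbf{r}'}{|\mathbf{r}-\mathbf{r}'|^{3}}\,dV' + \frac{\mu_0}{4\pi}\int_{\partial V_f} \mathbf{M}(\mathbf{r}')\cdot\hat{\mathbf{n}}(\mathbf{r}')\,\frac{\mathbf{r}-\mathbf{r}'}{|\mathbf{r}-\mathbf{r}'|^{3}}\,dS', \quad \text{for } \mathbf{r}\in V_f \qquad (3.1)
$$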

where BS is the magnetic induction produced in the free 

space by the stator and rotor currents, Vf is the region 

filled by the magnetic materials, ∂Vf is its boundary and 

ˆn is the outward unit normal on ∂Vf. 

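The nonlinear constitutive relationship in $V_f$ can be expressed by introducing the local operator $\mathcal{G}$ as 

$$\mathbf{M}(\mathbf{r}) = \mathcal{G}[\mathbf{B}(\mathbf{r})] \quad \text{in } V_f \qquad (3.2)$$

Therefore $\mathbf{M}$ is the solution of the following nonlinear problem, where $\hat{\mathcal{B}}$ denotes the operator mapping $\mathbf{M}$ to the right-hand side of (3.1): 

$$\mathbf{M}(\mathbf{r}) = \mathcal{G}\big[\hat{\mathcal{B}}[\mathbf{M}]\big](\mathbf{r}) \quad \text{in } V_f \qquad (3.3)$$

As shown in [14], the operator $\mathcal{G}\hat{\mathcal{B}}$ is a contraction if $\mathcal{G}$ is uniformly monotonic and Lipschitzian. Therefore, the solution exists, is unique and can be found by fixed point iteration. 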
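The behaviour of such a fixed point iteration can be illustrated on a case with a closed-form answer: a uniformly magnetized sphere in a uniform source field, for which the integral terms of (3.1) reduce to $\mathbf{B} = \mathbf{B}_S + \mu_0(1-N)\mathbf{M}$ with demagnetizing factor $N = 1/3$. The sketch below is a scalar toy model with hypothetical function names and a linear material law; it is not the discretized solver of the paper.

```python
import math

MU_0 = 4e-7 * math.pi  # vacuum permeability in H/m


def linear_law(mu_r):
    """Constitutive law M = G(B) for a linear material: M = (mu_r - 1)/mu_r * B/mu0."""
    return lambda b: (mu_r - 1.0) / mu_r * b / MU_0


def picard_magnetization(b_source, constitutive, demag_factor,
                         tol=1e-12, max_iter=1000):
    """Fixed point iteration M <- G(B_S + mu0*(1 - N)*M) for a uniformly
    magnetized body; returns the magnetization and the iteration count."""
    m = 0.0
    for iteration in range(1, max_iter + 1):
        b = b_source + MU_0 * (1.0 - demag_factor) * m
        m_new = constitutive(b)
        if abs(m_new - m) <= tol * max(1.0, abs(m_new)):
            return m_new, iteration
        m = m_new
    raise RuntimeError("fixed-point iteration did not converge")


if __name__ == "__main__":
    mu_r, b_s = 1000.0, 1e-3
    m, iterations = picard_magnetization(b_s, linear_law(mu_r), 1.0 / 3.0)
    exact = 3.0 * (mu_r - 1.0) / (mu_r + 2.0) * b_s / MU_0
    print(m, exact, iterations)
```

Even for a relative permeability of 1000 the contraction factor here is $(\mu_r-1)(1-N)/\mu_r \approx 0.67$, so the iteration written in terms of B and M converges in a few tens of steps to the classical result $M = 3(\mu_r-1)/(\mu_r+2)\,H_S$.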

From the numerical point of view, the magnetization can be expressed in terms of piece-wise constant vector shape functions in each elementary volume arising from the discretization of $V_f$, such that 

$$\mathbf{M}(\mathbf{r}) = \sum_j M_j\,\mathbf{P}_j(\mathbf{r}) \quad \text{in } V_f \qquad (3.4)$$

where the $\mathbf{P}_j$'s are discontinuous shape functions obtained by multiplying the scalar pulse functions $p_k$ ($p_k = 1$ in the $k$-th element and zero elsewhere) by the (three) unit vectors along the coordinate axes. 

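The numerical model is obtained by applying the Galerkin method to (3.3), rewritten as $\mathcal{G}^{-1}[\mathbf{M}] = \hat{\mathcal{B}}[\mathbf{M}]$ in $V_f$, with $\mathcal{G}$ the constitutive operator of (3.2) and $\hat{\mathcal{B}}$ the field operator of (3.1): 

$$\int_{V_f}\mathbf{P}_i\cdot\mathcal{G}^{-1}[\mathbf{M}]\,dV = \int_{V_f}\mathbf{P}_i\cdot\hat{\mathcal{B}}[\mathbf{M}]\,dV, \quad \forall i \qquad (3.5)$$

The fixed point iteration is therefore rewritten as: 

$$\frac{\int_{V_f}\mathbf{P}_i\cdot\mathcal{G}^{-1}[\mathbf{M}^{k+1}]\,dV}{\int_{V_f}\mathbf{P}_i\cdot\mathbf{P}_i\,dV} = \frac{\int_{V_f}\mathbf{P}_i\cdot\hat{\mathcal{B}}[\mathbf{M}^{k}]\,dV}{\int_{V_f}\mathbf{P}_i\cdot\mathbf{P}_i\,dV}, \quad \forall i \qquad (3.6)$$

where the index $k$ indicates the approximation of $\mathbf{M}$ and $\mathbf{B}$ at the $k$-th iteration. Note that, since the term $\int_{V_f}\mathbf{P}_i\cdot\mathbf{P}_i\,dV$ is the volume of the $i$-th element, the r.h.s. of (3.6) is the average of the magnetic induction $\mathbf{B}^{k} = \hat{\mathcal{B}}[\mathbf{M}^{k}]$ in the $i$-th elementary volume at iteration $k$. Since the magnetization is piece-wise constant, eq. (3.6) can be solved for $\mathbf{M}^{k+1}$ in each element by applying the constitutive relation to the average magnetic induction in the same element. 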

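Therefore, after discretization, (3.6) gives rise to the following fixed point iteration [14, 15]: 

$$\mathbf{B}^{k} = \mathbf{D}^{-1}\left(\mathbf{E}\,\mathbf{M}^{k} + \mathbf{W}\right) \qquad (3.7)$$

$$\mathbf{M}^{k+1} = \mathbf{G}\left(\mathbf{B}^{k}\right) \qquad (3.8)$$

where: 

$$
\begin{aligned}
E_{ij} &= \mu_0 D_{ij} - \frac{\mu_0}{4\pi}\int_{\partial V_i}\int_{\partial V_j}\frac{\left[\hat{\mathbf{n}}(\mathbf{r})\cdot\mathbf{P}_i(\mathbf{r})\right]\left[\hat{\mathbf{n}}(\mathbf{r}')\cdot\mathbf{P}_j(\mathbf{r}')\right]}{|\mathbf{r}-\mathbf{r}'|}\,dS\,dS' \\
D_{ij} &= \int_{V_f}\mathbf{P}_i\cdot\mathbf{P}_j\,dV \\
W_i &= \int_{V_f}\mathbf{P}_i\cdot\mathbf{B}_S\,dV
\end{aligned}
\qquad (3.9)
$$

$V_i$ is the volume of the $i$-th element, $\mathbf{M}^{k}$ the column vector of the coefficients in (3.4) at the $k$-th iteration, $\mathbf{B}^{k}$ the column vector made of the average magnetic induction in the elements, and $\mathbf{G}$ is the global relationship corresponding to the constitutive operator of (3.2) after the discretization process. 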

The flux Φn linked with the n-th circuit of volume τ and 

produced by the currents flowing in a set of coils is 

defined as: 

 

$$\Phi_n = \int_{\tau_n}\mathbf{A}(\mathbf{r})\cdot\mathbf{J}_n(\mathbf{r})\,d\tau \qquad (3.10)$$

where A is the magnetic vector potential associated to all 

the sources and Jn is the current density associated to the 

unit current impressed in the n-th circuit. 

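This definition is consistent with the definition of the magnetic energy in the linear case and with the definition of the flux linked with a circuit of infinitely small cross-section. As a matter of fact, in this case it results: 

$$\Phi_n = \int_{\tau_n}\mathbf{A}(\mathbf{r})\cdot\frac{\mathbf{J}_n(\mathbf{r})}{I_n}\,d\tau = \frac{1}{I_n}\oint_{\gamma_n}\mathbf{A}(\mathbf{r})\cdot\hat{\mathbf{t}}\,\frac{I_n}{S_n}\,S_n\,d\ell = \oint_{\gamma_n}\mathbf{A}(\mathbf{r})\cdot\hat{\mathbf{t}}\,d\ell \qquad (3.11)$$

where $\gamma_n$ is the closed curve defining the axis of the circuit, $\hat{\mathbf{t}}$ the unit tangent vector, $S_n$ the cross-section of the conductor and $I_n$ the unit current flowing in the $n$-th filamentary circuit. 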
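The filamentary limit of the flux definition lends itself to a quick numerical check. The sketch below approximates the closed line integral of $\mathbf{A}\cdot\hat{\mathbf{t}}$ by a midpoint rule on a polygonal loop and applies it to a uniform field $B_z$, for which $\mathbf{A} = \tfrac{1}{2}\mathbf{B}\times\mathbf{r}$ and the flux through a circle of radius $R$ is $B_z\pi R^2$. The function names and the test field are illustrative assumptions, not part of the formulation above.

```python
import math


def flux_linkage(vector_potential, loop_points):
    """Midpoint-rule approximation of the closed line integral of A . t dl
    along the polygon defined by loop_points (last point joins the first)."""
    total = 0.0
    n = len(loop_points)
    for i in range(n):
        p0 = loop_points[i]
        p1 = loop_points[(i + 1) % n]
        mid = tuple(0.5 * (a + b) for a, b in zip(p0, p1))
        a_mid = vector_potential(mid)
        total += sum(a * (q - p) for a, p, q in zip(a_mid, p0, p1))
    return total


def uniform_field_potential(b_z):
    """Vector potential A = 0.5 * B x r of a uniform field B = (0, 0, b_z)."""
    return lambda r: (-0.5 * b_z * r[1], 0.5 * b_z * r[0], 0.0)


def circle(radius, segments):
    """Counter-clockwise polygonal approximation of a circle in the z = 0 plane."""
    return [(radius * math.cos(2 * math.pi * k / segments),
             radius * math.sin(2 * math.pi * k / segments),
             0.0) for k in range(segments)]


if __name__ == "__main__":
    flux = flux_linkage(uniform_field_potential(1.0), circle(0.1, 360))
    print(flux, math.pi * 0.1 ** 2)
```

For this linear potential the midpoint rule is exact on each segment, so the only error is the difference between the polygon area and the circle area.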

The magnetic vector potential appearing in (3.11) can be calculated from both free and (magnetic) polarization currents, after (3.3) is solved with suitable boundary conditions, by applying the Biot-Savart law for the magnetic vector potential: 

$$\mathbf{A}(\mathbf{r}) = \mathbf{A}_S(\mathbf{r}) + \frac{\mu_0}{4\pi}\int_{V_f}\frac{\mathbf{M}(\mathbf{r}')\times(\mathbf{r}-\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|^{3}}\,dV', \quad \text{for } \mathbf{r}\in V_f \qquad (3.12)$$

In (3.12) $\mathbf{A}$ has been written as the sum of the contributions of the free and magnetizing currents. The whole procedure is quite time consuming when a high number of unknowns is treated; however, a very effective computational tool based on a suitable use of high performance computing architectures has recently been proposed [18]. 

III. THE 3D CORRECTION TO THE 2D SOLUTION 

In the case here treated the 2D solution provides an 

accurate solution in a large part of the domain of interest. 

As a consequence, the 3D analysis can be limited just to 

the region where it is really required. 

The classical theory of electrical machinery [19] suggests splitting fluxes and inductances into 2D and 3D contributions, each corresponding to one of the two geometrical parts of the system. 

Unfortunately, such a separation is rather questionable and ambiguous. A different approach is therefore suggested here: the actual 3D magnetic flux linked with a closed line is split into (a) the 2D part $\Phi_n^{(2D)}$ (evaluated as the 2D flux per unit length in the axially symmetric region, multiplied by the length) and (b) the complement $\Delta\Phi_n^{(3D)}$, defined as the 3D correction required for the case at hand. Similar considerations can be applied to other parameters, including the main or leakage flux coefficients. 
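Written out, the split described above reads as follows. The symbols $\Phi_n$ for the actual 3D flux, $\ell$ for the length of the axially uniform region, $\varphi_n^{(2D)}$ for the 2D flux per unit length and $k_n$ for a relative correction factor are notational assumptions introduced here for illustration only.

$$
\Phi_n = \Phi_n^{(2D)} + \Delta\Phi_n^{(3D)}, \qquad \Phi_n^{(2D)} = \ell\,\varphi_n^{(2D)}, \qquad \Delta\Phi_n^{(3D)} = \Phi_n - \ell\,\varphi_n^{(2D)}
$$

$$
k_n = \frac{\Phi_n}{\Phi_n^{(2D)}} = 1 + \frac{\Delta\Phi_n^{(3D)}}{\Phi_n^{(2D)}}
$$

Under this reading, a factor $k_n$ close to one indicates that the 2D model already captures the flux linkage of circuit $n$, while larger deviations quantify the end-region contribution for the operating condition considered.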

IV. NUMERICAL EXAMPLE 

Among rotating electrical machinery, large turbo-generators are endowed with quite long end windings, which contribute little to the linkage flux but, unfortunately, produce significant leakage flux [20]. Such a contribution is generally evaluated by considerably simplifying the complex geometry and, in addition, by neglecting the nonlinearities [7]. 

The integral approach presented in Section II is a powerful tool able to provide a deeper analysis of the machinery and to overcome both limitations. This kind of information is particularly relevant in the design process of the turbine generator, where the leakage flux is of preeminent significance. 

In the following a numerical example is presented in 

order to assess the consistency of the 3D corrections to be 

considered for different operating conditions. 

The turbine generator simulated has a rated apparent 

power in the range 300-350 MVA depending on the room 

temperature affecting the cooling system. It has two poles 

and the nominal frequency is 50 Hz. 

Some details of the finite element mesh of the iron 

regions denoted (Vf ) as well as of the field and armature 

coils, are reported in Fig.1. 

The 3D FEM model is characterized by a number of 

45593 nodes (24201 in the iron region Vf including stator 

and rotor iron as well as the enclosure) and 17774 

elements (12478 in Vf ). It is worth noticing that the a non 

conformal mesh of the iron regions Vf has been adopted in 

order to exploit some additional geometrical degrees of 

freedom in the sub-regions where a particular refinement 

of the mesh is necessary.

Figure 1: Section of the finite element mesh used in the computations (labelled parts: armature coil, field coil, stator iron, rotor iron, enclosure).

Boundary conditions. In principle the complete geometry of the machine should be treated. However, only a part has been considered, while the effect of the remaining part has been enforced by suitable boundary conditions on a cutting plane in the region where the 3D solution actually meets the 2D approximation.

Material characterization. The magnetic properties of the iron have been represented in the form (3.2) by substituting the BH curve in B=μ0(H+M).

In the following the no-load operation mode is considered: the rotor is assumed to rotate at the nominal angular speed, a DC current is applied to the field coil and an open circuit is imposed at the armature terminals.

Notice that, for a synchronous machine, such an operating condition can be examined by means of a single magnetostatic problem. As a matter of fact, such a solution provides as many samples of the time evolution of the terminal voltage as the number of stator slots if the stator winding is a double-layer one [16].
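The sampling argument above can be illustrated with a short sketch. It assumes, hypothetically, a two-pole machine with N equally spaced stator slots, so that the flux linkages of coils displaced by one slot pitch correspond to time samples spaced T/N apart over one period T. The slot count, the flux amplitudes and the function names are illustrative and are not taken from the paper.

```python
import cmath
import math


def amplitude_spectrum(samples):
    """Single-sided DFT amplitude spectrum of one period of a real signal."""
    n = len(samples)
    amps = []
    for h in range(n // 2):
        x_h = sum(s * cmath.exp(-2j * math.pi * h * k / n)
                  for k, s in enumerate(samples))
        scale = 1.0 / n if h == 0 else 2.0 / n
        amps.append(scale * abs(x_h))
    return amps


def no_load_emf_amplitudes(flux_amps, frequency_hz):
    """EMF amplitude per harmonic, E_h = h * omega * Phi_h (from e = -dPhi/dt)."""
    omega = 2.0 * math.pi * frequency_hz
    return [h * omega * phi for h, phi in enumerate(flux_amps)]


# Hypothetical data: 48 slots, 50 Hz, fundamental plus a small third harmonic.
N_SLOTS = 48
F_NOMINAL = 50.0
PHI_1, PHI_3 = 1.0, 0.05  # flux linkage amplitudes in Wb (made up)

# One magnetostatic solution: flux linkage of the coil at each slot position.
flux_samples = [
    PHI_1 * math.cos(2.0 * math.pi * k / N_SLOTS)
    + PHI_3 * math.cos(3.0 * 2.0 * math.pi * k / N_SLOTS)
    for k in range(N_SLOTS)
]

flux_amps = amplitude_spectrum(flux_samples)
emf_amps = no_load_emf_amplitudes(flux_amps, F_NOMINAL)
```

The spectrum recovers the two flux harmonics that were put into the synthetic samples, and the voltage harmonics follow by multiplying each flux amplitude by its own angular frequency.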

Of course, the nonlinear behavior of the iron affects the main flux and, consequently, the output voltages. Therefore the magnetic analysis has been repeated for several rotor currents, including 100, 500 and 1000 A.

In particular, the field current of 500 A corresponds to the rated voltage at the generator terminals. In the following, the 500 A case is discussed in detail, while the other two currents are considered only to evaluate the saturation effect on ΔΦn(3D).

The three-dimensional distributions of the magnetic vector potential on the field and armature coils are sketched in figs. 2 and 3, and those of the flux density in figs. 4 and 5, assuming the boundary condition imposed on the symmetry plane and an excitation current of 500 A.

Flux waveforms as well as their amplitude spectra are then calculated according to (3.11); the results are reported in fig. 6. In order to look for higher harmonics, a spectrum analysis of the principal flux has been performed (fig. 7).


Figure 2: Amplitude of the magnetic vector potential [Tm] in the armature coil.

Figure 3: Amplitude of the magnetic vector potential [Tm] in the field coil.

Figure 4: Amplitude of the magnetic induction [T] in the armature coil.

Figure 5: Amplitude of the magnetic induction [T] in the field coil.

As mentioned before, the 2D field coincides with the 3D field in the neighborhood of the symmetry plane. The vector potential and flux density components of the 3D model can therefore be compared with the 2D solution to evaluate the validity of the 2D approximation. The 2D solution matches the 3D one in a large part of the active length, with a good approximation (of the order of 90%).

The comparison of the magnetic flux and of the inductance coefficient per unit length of both solutions provides the desired correction factors. The flux waveforms as well as the amplitude spectrum from the 2D solution are reported in figs. 6 and 7, and the results are compared with those from the 3D solution.

Figure 6: Flux linkage [Wb] waveforms of a single stator coil.

From the comparison of the main fluxes it follows that the 2D solution underestimates the flux, as well as the no-load voltage, by ΔΦn(3D) = 2%.

Unfortunately, due to its complexity, both the accuracy and the resolution of the 3D solution could be unsatisfactory for a number of applications. Of course, the 2D analysis is able to provide a more accurate and robust solution in the plane. Therefore, in order to assess the quality of the 2D solution given by the 3D analysis, a 2D analysis of linkage fluxes has been carried out by using the commercial software package Ansys (Release 13), and the voltage computed from the principal flux has been compared with both the 3D evaluations and the experimental measurements.

The finite element mesh used in this case is shown in fig. 8 and consists of 125549 nodes and 5754 second-order triangular elements.

The no-load voltage has been computed and compared to the 3D calculations. The actual end-winding effect, neglected by the 2D solution, is highlighted in fig. 9, where the air gap radial magnetic induction is plotted as a function of the angle θ and of the axial position Z. In addition, to further assess the analysis, the 2D no-load voltage has also been compared with a set of experimental measurements.

The discrepancy is almost negligible (below 1%). Of course, such a result comes from a balance of two conflicting effects: the first is the lack of the end-winding contribution and the second is the error introduced by neglecting the 3D effects in the 2D evaluations.

Figure 7: Flux linkage [Wb] amplitude spectrum.

Figure 8: 2D finite element mesh used in the 2D calculations.

Figure 9: 3D rendering of the radial magnetic flux at the air gap.

Similar results are obtained with different currents (discrepancy of the order of 2-3% of the actual flux linkage with 100 A or 1000 A), showing that, in no-load operation, the effect of iron nonlinearity is quite limited.

The same procedure is applied to evaluate the leakage flux and its contribution coming from 3D effects. The results show that the discrepancy is much more relevant: of the order of 10, 15 and 20% for an exciting current of 100, 500 and 1000 A, respectively.
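As a minimal sketch of how such a discrepancy can be evaluated, the snippet below computes the 3D correction from a 3D flux linkage and a 2D flux per unit length. The function name, the choice of normalizing by the 3D value and all numerical values are assumptions made for illustration; they are not results from the paper.

```python
def correction_3d(flux_3d, flux_2d_per_length, length):
    """Return (absolute, relative) 3D correction to the 2D flux estimate.

    The 2D estimate is the flux per unit length times the axial length;
    the relative correction is referred to the 3D (actual) flux linkage.
    """
    flux_2d = flux_2d_per_length * length
    delta = flux_3d - flux_2d
    return delta, delta / flux_3d


# Hypothetical values: 5.10 Wb from the 3D model, 1.0 Wb/m over 5.0 m in 2D.
delta, rel = correction_3d(flux_3d=5.10, flux_2d_per_length=1.0, length=5.0)
print(f"delta = {delta:.3f} Wb, relative = {100 * rel:.1f} %")
```

Repeating the call for each excitation current gives the variation of the correction with saturation, which is the comparison discussed above.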

V. CONCLUSION 

The computation of end-winding inductances of large turbo generators requires proper mathematical tools. This paper proposes an integral FEM formulation to compute the 3D vector potential and flux density distribution in the end regions. The approach does not require the meshing of the free space, thus allowing a significant reduction of the computational burden.

A large synchronous generator with power in the range 300-350 MVA has been analyzed as a case study. Both axial and radial components of the flux density generated by stator coils have been computed.

The flux linkages for all coils and the no-load voltages have been computed.

In order to assess the influence of the end regions, different stack lengths have been simulated for the same end-winding length. The results achieved include the definition of the 3D correction to be applied to the 2D simulation and, in addition, the variation of the correction as a function of the exciting current has been evaluated. The comparison between terminal quantities coming from widely used 2D models and experimental measurements has been reviewed by considering 3D effects evaluated by using the proposed model.

REFERENCES 

[1] B. Hosninger, “Theory of end-winding leakage reactance”, Power 

Apparatus and Systems, Part III, vol. 78, pp. 417-426, Aug. 1959. 

[2] W. M. Arshad, H. Lendenmann, Y. Liu, J.-O. Lamell and H. 

Persson, “Finding end winding inductances of MVA machines” 

Proc. Fortieth IAS Meeting, vol. 4, pp. 2309-2314, 2005. 

[3] M.F. Hsieh, Y.C. Hsu, D.G. Dorrell, and K.H. Hu, "Investigation 

on end winding inductance in motor stator windings", IEEE 

Trans. on Magn., vol. 43, pp. 2513–2515, June 2007. 

[4] J. Pyrhönen, T. Jokinen and V. Hrabovcová, Design of Rotating 

Electrical Machines, John Wiley and Sons, Ltd, 2008. 

[5] J. A. Tegopoulos, "End component of armature leakage reactance 

of turbine generators", IEEE Trans. on PAS, vol. 83, pp. 632-637, 

June 1964. 

[6] J. A. Tegopoulos, “Current sheets equivalent to the end-winding 

currents of turbine generator stator and rotor,” AIEE Trans. Pt. III, 

vol. PAS-81, pp. 695–700, February 1963. 

[7] V.S Lazarns, A.G Kladas, A.G Mamalis, and J.A. Tegopoulos, 

"Analysis of end zone magnetic field in generators and shield 

optimization for force reduction on end windings", IEEE Trans. 

on Mag., vol. 45, pp.1470–1473, March 2009. 


[8] D. J. Scott, S. J. Salon, and G. L. Kusik, “Electromagnetic forces 

on the armature end windings of large turbine generators I— 

Steady state conditions”, IEEE Trans. PAS., vol. PAS-100, pp. 

4597–4603, Nov. 1981. 

[9] Q. Li and F. Wang, “Application of image method to calculate 3- 

D magnetic field and parameters of SC alternator”, IEEE Trans. 

on Mag., vol. 25, pp. 1850–1853, Feb. 1989. 

[10] D. Ban, D. Zarko, and I. Mandic, “Turbo-generator end winding 

leakage inductance calculation using a 3D analytical approach 

based on the solution of Neumann integrals”, IEEE Trans. on En. 

Conv., vol. 20, pp. 98–105, March 2005. 

[11] A.T. Brahimi, A. Foggia, and G. Meunier, "End winding 

reactance computation results using a 3D finite element program" 

IEEE Trans. on Mag., vol. 29, pp. 1411-1414, March 1993. 

[12] R. Albanese, F. Calvano, G. Dal Mut, F. Ferraioli, A. Formisano, 

F. Marignetti, R. Martone, G. Rubinacci, A. Tamburrino and S. 

Ventre, "Coupled three dimensional numerical calculation of 

forces and stresses on the end windings of large turbo generators 

via Integral Formulation", IEEE Trans. on Mag., vol. 48, pp. 875 

- 878, Feb. 2012. 

[13] F. Calvano, G. Dal Mut, F. Ferraioli, A. Formisano, F. Marignetti, 

R. Martone, G. Rubinacci, A. Tamburrino and S. Ventre, “A 

novel technique based on integral formulation to treat the motion 

in the analysis of electric machinery”, International Journal of 

Applied Mathematics and Mechanics, in press. 

[14] R. Albanese, F. I. Hantila, and G. Rubinacci, “A nonlinear eddy 

current integral formulation in terms of a two-components current 

density vector potential”, IEEE Trans. Mag. 32, pp. 784-787, 

March 1996. 

[15] R. Albanese, and G. Rubinacci, “Finite elements methods for the 

solution of 3D eddy current problems”, Advances in Imaging and 

Electron Physics, vol. 102, pp. 1-86, 1998. 

[16] N. Bianchi, Electrical machine analysis using finite elements, Taylor and Francis, pp. 141-162, 2005.

[17] M.V.K. Chari, S.H. Minnich, S.C. Tandon, Z.J. Csendes, J. 

Berkery, “Load characteristics of synchronous generator by the 

finite element method”, IEEE Trans. on PAS, vol. 100, pp.1-13, 

January 1981. 

[18] R. Albanese, F. Calvano, G. Dal Mut, F. Ferraioli, A. Formisano, 

F. Marignetti, R. Martone, G. Rubinacci, A. Tamburrino and S. 

Ventre, “Electromechanical analysis of end windings in turbo 

generators”. COMPEL, vol. 30, pp. 1885-1898, 2011. 

[19] J. Pyrhönen, T. Jokinen, V. Hrabovcová, Design of Rotating Electrical Machines, John Wiley and Sons, Ltd, 2008, pp. 246-249.

[20] M.V. Deshpande, Design and Testing of Electrical Machines, 

Phi learning Pvt. Ltd., 2010.


Magnetomechanical Coupled FE Simulations of 

Rotating Electrical Machines 

*A. Belahcen, *K. Fonteyn, † R. Kouhia, *P. Rasilo, and *A. Arkkio 

*Aalto University, Dept. of Electrical Engineering, POBox 13000, FIN-00076 Aalto, Finland 

† Tampere University of Technology, Dept. of Mechanics and Design, P.O BOX 589, 33101 Tampere, Finland 

E-mail: anouar.belahcen@aalto.fi 

Abstract— Regardless of the relatively large amount of published models of magnetostriction, only few of them have been 

applied to describe this phenomenon in electrical steel and even fewer have been incorporated in the FE simulation of electrical

machines. In this paper we review the models of magnetostriction and magnetomechanical coupling in electrical steel and their 

incorporation into the FE analysis of rotating electrical machines. We also discuss the advantages and disadvantages of the 

different models and present an energy-based coupled magnetomechanical set of constitutive equations that describe both the 

magnetostriction and the magnetic nonlinearity and its dependency on stresses in the electrical steel. We further present the 

implementation of these equations into in-house 2D FE software for the simulations of electrical machines. The simulations 

carried out show that the energy based model describes well the vibrations of electrical machines due to magnetostriction and 

reluctance forces. A discussion on how the model should be improved to account for hysteresis is also presented. 

Index Terms—coupled models, electrical machines, finite elements, magnetostriction. 

I. INTRODUCTION

Owing to the high demand for energy-efficient and environmentally friendly apparatuses, the designers of electrical machines, among others, are more and more interested in accurate computational methods for the analysis of their designs. The energy conversion in electrical machines occurs within three different but tightly coupled subsystems, i.e., the electrical system, the magnetic system and the mechanical system, as shown in Fig. 1. The electric system consists of the windings of the machine, which are connected to the supply in the case of a motor or to the load in the case of generator operation. The electric supply/load is nowadays typically a voltage source frequency converter, the current of which is controlled according to the operation point of the machine. Such current depends on the load of the machine and consists of a torque-producing component as well as a component necessary for the magnetization of the iron core and another compensating for the core losses. The magnetic system consists of the iron core of the machine as well as the airgap and other construction parts in which a magnetic flux is produced by the coils' currents according to Ampère's law. These fluxes and their interaction with the magnetic materials produce forces and torque that are transferred to the mechanical system, consisting of the rotor, the shaft and the bearings of the machine as well as a possible cooling fan mounted on the shaft of the machine.

The electric, magnetic and mechanical systems, although represented as separate subsystems, are tightly coupled to each other and their operation quantities cannot be solved separately. Indeed, the current drawn by the machine, the torque it produces and the magnetic flux density in the airgap and the iron core are usually solved simultaneously, especially if the machine is voltage fed.

The operation of the above subsystems is known to be dissipative, as there are energy losses related to each subsystem. The Joule losses resulting from the currents flowing in the windings are usually easy to compute if the current density and the geometry of the windings are known. However, some issues related to the skin effect and the eddy currents make this computation rather complex in some special cases, as explained by Islam et al. [1]. In iron, due to its magnetic domain structure and its finite electric conductivity, the flow of a time-varying flux produces hysteresis and eddy-current losses (in some approaches also excess losses). The computation of these so-called iron losses is still one of the most active research fields in the simulation of electrical machines. Finally, the motion of the rotating parts of the machine and the friction in the bearings of the machine, as well as that between the moving parts and the air, produce mechanical losses that can be computed through complex CFD models in the case of high-speed machines or approximated by semi-empirical equations.

The knowledge of the above loss components is valuable information for the designers of electrical machines, as it allows them to optimize the structure of the machine with regard to the cooling and the mechanical strength as well as the use of magnetic and other materials.

Besides the structural and cooling optimization of the machine, the designers are bound by environmental aspects such as the level of acoustic noise and the estimation of the lifecycle of the machine. The acoustic noise is produced by the vibrations of the structure of the machine under the effect of different forces and other excitations and by the airflow in different channels.

Fig. 1. Illustration of the energy conversion process with the related electric, magnetic and mechanical subsystems and related losses, forces and vibrations and their origins. [Diagram blocks: electrical power; electrical system (windings); coupling field (iron and air); mechanical system (bearings and fan); mechanical power. Associated effects: Lorentz forces; magnetic forces and magnetostriction; air flow and friction; electrical, magnetic and mechanical losses; vibrations, wearing and noise.]

The interaction between the flux and the currents 

produces Lorentz forces acting on the windings, while the 

flow of the magnetic flux in the iron core and the airgap 

gives rise to magnetic forces and strains in the core. 

The vibrations produced by the interaction between the 

magnetic forces, the deformations and the structure result 

in acoustic noise and mechanical wearing of different 

parts of the machine such as the winding insulation and 

the bearings. The knowledge of these parasitic effects at 

the design stage will help in reducing the vibrations and 

noise of the machine as well as estimating the lifecycle of 

the machine and optimizing the structure for longer life 

too. An illustration of the different losses and vibration 

phenomena occurring at different subsystems of the 

energy conversion is given in Fig. 1. 

The strains and stresses in the iron core of an electrical machine are produced by different sources and have a strong degrading effect on the quality of the iron, thus reducing the efficiency of the energy conversion process and increasing the amount of iron needed for a given power of the machine. The effect of the mechanical stress on the power losses in electrical steel has been known for quite a long time. Already in the 1970s Moses [2], [3], among others, showed through magnetic measurements that the mechanical stress affects the losses, the magnetization, and the magnetostriction of electrical steel. Magnetostriction here means the relative change in the length of a specimen of magnetic material when it is subjected to a magnetic field.

For a better understanding, let us first clarify what magnetostriction is. In 1842 J. P. Joule discovered what is today called Joule magnetostriction, which is a volume-conserving deformation of magnetic material caused by its magnetization. Such a deformation results in an elongation of the material in the direction of magnetization and a contraction in the orthogonal directions for positive magnetostrictive materials, and vice versa for negative magnetostrictive materials. Later on, it was observed that at high values of the magnetization the deformation is no longer volume conserving, and this phenomenon was called volume magnetostriction. Both types of magnetostriction are called forced magnetostriction in the sense that they are caused by the magnetization, which forces the magnetic domain walls to move and the domains to rotate, thus producing the mechanical deformation. On the other hand, when the magnetic material is cooled down from a high temperature, it undergoes a strong isotropic change in its dimensions as it goes through the Curie temperature. Such a change in volume was explained by the formation of magnetic domains and the orientation of elementary magnetic moments within the domains. Several other experimental works have been conducted on magnetic materials, and separate magnetomechanical phenomena have been named after their respective discoverers. For example, the skew magnetostriction, resulting from a helical magnetization and producing a twisting of some electrically conducting magnetic material, has been called the Wiedemann effect, and the inverse effect of magnetostriction, which results in a change of the magnetization properties of magnetic materials under the action of applied mechanical stress, was called the Villari effect. Also the apparent change in the Young modulus of magnetic material, which is due to the intrinsic rearrangement of the magnetic domains and thus the intrinsic elongation following this rearrangement, has been called the Delta-E effect. A comprehensive description of these phenomena can be found, e.g., in [4]. The intrinsic forms of magnetostriction are nowadays referred to as spontaneous magnetostriction. They are of great interest for metallurgists but are not of importance for the simulation of electrical machines under operation, as they no longer occur at this stage. Fig. 2 shows an illustration of the difference between spontaneous and forced magnetostriction.
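The volume-conserving character of Joule magnetostriction described above can be checked with a small sketch. It uses the common textbook expression for isotropic magnetostriction, eps_ij = (3/2) λ (m_i m_j - δ_ij/3), where m is the unit vector along the magnetization; this expression, the value of λ and the function name are assumptions made for illustration and are not the model developed in this paper.

```python
import math


def joule_strain(lam, m):
    """Isotropic Joule magnetostriction strain tensor for direction m.

    eps_ij = 1.5 * lam * (u_i * u_j - delta_ij / 3), with u = m / |m|.
    """
    norm = math.sqrt(sum(c * c for c in m))
    u = [c / norm for c in m]
    return [
        [1.5 * lam * (u[i] * u[j] - (1.0 / 3.0 if i == j else 0.0))
         for j in range(3)]
        for i in range(3)
    ]


LAMBDA = 5e-6  # hypothetical positive magnetostriction coefficient (m/m)

eps = joule_strain(LAMBDA, (1.0, 0.0, 0.0))  # magnetization along x
trace = eps[0][0] + eps[1][1] + eps[2][2]    # relative volume change
```

For a positive λ the sample elongates by λ along the magnetization and contracts by λ/2 in each orthogonal direction, so the trace of the strain tensor, i.e. the relative volume change, vanishes. A negative λ reverses all signs.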

Fig. 2. Illustration of spontaneous and forced magnetostriction. [Panel labels: spontaneous magnetostriction on cooling through the Curie temperature (H = 0); forced magnetostriction on applying an external magnetic field H, with positive and negative Joule magnetostriction.]

The mechanical stress or strain acting on the magnetic material can originate from different phenomena, some of which produce static stresses and others dynamic stresses. The shrink fitting of the stator into the frame of the machine produces static compressive stresses of the order of 10 MPa, as shown by Fujisaki et al. [5]. These stresses can be evaluated by means of numerical simulations or through analytical approximations, but due to the manufacturing tolerances they may have excessive local values. On the other hand, the so-called reluctance forces occurring between the stator and the rotor of the machine, which are time and space dependent, produce dynamic tensile and compressive stresses at different locations of the machine. These stresses are at the origin of the so-called magnetic noise, i.e., the acoustic noise due to magnetically excited vibrations. The level of these stresses depends on the magnetic flux density in the air gap of the machine and can be of the order of 200 MPa. Further, the rotating and alternating magnetization of the iron core produces dynamic magnetostrictive strains, the level of which depends on the state of stress in the core. Last but not least, the punching of the magnetic material in the manufacturing process produces residual stresses and plastic strains in some regions of the magnetic material. The level of these stresses and strains depends on the manufacturing process and the quality of the punching tools.

All these stresses and strains will affect both the 

magnetization characteristics and the energy losses of the 

magnetic material as well as the vibrations of the core and 

thus the acoustic noise and the wearing of the materials 

and parts of the machine. 

The modeling of magnetostriction started in the 1950s in connection with the so-called giant magnetostrictive materials and their applications in ultrasound generators and receivers. The effect of magnetostriction on the noise of power transformers was also investigated through experimental studies, starting in the 1970s by Moses [6] and later by Weiser and Pfutzner [7], but the modeling of such phenomena in electrical machines started only within the last two decades, as it was noted that the prediction of the losses and vibrations of the machine requires adequate models able to account for the magnetomechanical coupling as well as the other electromagnetic couplings.

In this paper we will concentrate on the modeling of magnetostriction and the related magnetomechanical effects and their incorporation into the simulation of electrical machines, as well as the effect of these phenomena on the computation of iron losses. The coupled electromagnetic simulation methodology for electrical machines, which made it possible to develop the magnetomechanical models, is now quite established, as reported by Arkkio [8] and Salon et al. [9] among others, and will not be handled here.

In section II we will explore the force-based models of 

magnetostriction and in section III the stress-strain based 

models. In section IV we will introduce the energy based 

model and in section V we will discuss the incorporation 

of the different models into 2D finite element simulation 

of electrical machines. In Section VI we will discuss the 

impact of these models on the computation of iron losses 

and sketch the necessity for future developments in view 

of accurate simulations and computation of losses and 

vibrations. 

II. FORCE-BASED MODELS OF MAGNETOSTRICTION 

The main idea behind the force-based models of 

magnetostriction is that the magnetostrictive deformation 

of a sample of magnetic material under homogeneous 

magnetization can be produced by a distributed set of 

forces acting on the boundaries of the sample as 

explained by Delaere et al. [10] and sketched in Fig. 3 for 

an arbitrary element. 

Such a representation of the magnetostriction emanates from earlier work of Besbes et al. [11], where the magnetostrictive forces were directly derived from the principle of virtual work by accounting for the variation of the permeability with the mechanical stress. It should be mentioned that in [11] the magnetostriction has not been well described, as severe assumptions such as linear magnetization and its linear dependency on the stress have been made. The local application of the principle of virtual work for the computation of magnetic forces itself was developed by Bossavit [12] after its introduction at a global level by Coulomb [13].

The development of such force-based models and their coupling with the electromagnetic simulation of electrical machines was also reported by Mohamed et al. [14], Vandevelde et al. [15] and Belahcen [16]. Vandevelde and Belahcen used a stress approach to compute the magnetostrictive forces, whereas Delaere used a strain approach. In all these works, the other magnetic forces were computed either according to the principle of virtual work and introduced into the FE simulation as generalized nodal forces or through the Maxwell stress tensor. The two methodologies have been earlier demonstrated by Kameari [17] to be equivalent. Lately, many authors applied the concept of magnetostrictive forces in the FE simulation of electrical machines.

The equation for the computation of the generalized nodal magnetic forces is given below

B 

T 1 

F A A d dSˆ 

J e 

Sˆ 

e 

H 

B J 

(1) 

 

e U U 

0 

 

where B and H are the magnetic flux density and field strength, respectively. J is the Jacobian matrix for the transformation from the reference finite element to the actual one and Ŝe stands for the reference element. U is the vector of nodal displacements. The integration with respect to the magnetic flux density in (1) as well as the differentiation with respect to the displacements is carried out analytically. For this purpose and for FE computation, a cubic spline representation of the HB-curve of the iron sheets is used. The different approaches for the computation of nodal magnetostrictive forces can be found in [10], [14], [15], and [16]. Fig. 4 shows the magnetic and magnetostrictive forces computed for the stator core of a synchronous machine [18]. Similar forces for induction machines have been reported in [19].

Fig. 3. Sketch of the computation of magnetostrictive force from 

magnetostrictive deformation after Delaere [10]. 

Fig. 4. Generalized magnetic forces (left) and equivalent 

magnetostrictive forces computed in the stator core of a 

synchronous machine. The forces have been normalized. 

The structural deformation can be computed either in a 

coupled or uncoupled methodology. In the coupled 

approach the mechanical and magnetic problems are 

solved simultaneously and the forces are updated at 

iteration level. In the uncoupled approach the magnetic 

problem is first solved and the forces computed as postprocessing 

quantities from the magnetic problem then 

introduced in the mechanical problem as loads. The 

results of the mechanical problem are the nodal 

displacements from which the deformation as well as the 

strains and stresses can be computed. 

Although the concept of magnetostrictive forces describes the deformation of the magnetic material quite well, it has a major drawback: it results in an erroneous stress state in the material. Indeed, the magnetostrictive forces result in a tensile stress if the boundaries of the element are free to move (refer to Fig. 3). However, the actual state of stress in this case should be zero stress in the element. If the boundaries of the element are fixed, the magnetostrictive forces will result in zero stress, while the actual stress is a compressive one, the magnitude of which depends on the material properties and the level of magnetization. This erroneous behavior due to equivalent magnetostriction forces is illustrated in Fig. 5, where the magnetostrictive elongation is solved correctly but the stress is wrong.

From the above discussion it is clear that the concept of magnetostrictive force is not able to describe the stress dependency of the magnetic properties of the material, and thus of the magnetostriction itself, when it has to be computed under different boundary conditions dictated by the geometry and the topology of the electrical machine.

III. STRAIN-BASED MODELS OF MAGNETOSTRICTION 

In the strain-based models of magnetostriction there is 

no need for the calculation of equivalent forces. The 

magnetostrictive strains are incorporated in the structural 

analysis in a similar way to the thermal dilatation of 

metals. Here also, the magnetostriction can be modeled in 

a decoupled or coupled approach. In the decoupled 

approach [20] the magnetostrictive strains are computed
from per-element magnetic flux densities and the
measured single-valued flux density-elongation

relationship. These strains are then incorporated in a 

structural finite element model of the electrical machine 

to produce the deformation. If the effect of stress is to be 

accounted for in the computation of the magnetostriction, 

a coupled model should be used. Such a model emanates 

from the coupled magnetomechanical constitutive 

equations of the material [21] as explained in the next 

Section. 
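The difference between the equivalent-force and the strain-based treatment can be sketched on a one-dimensional linear elastic bar, in the same way as for thermal dilatation. The sketch below is an illustration under stated assumptions only: the bar model, the Young modulus, the magnetostrictive strain and the function names are hypothetical and do not come from the models reviewed here.

```python
def force_based(young, eps_ms, clamped):
    """Equivalent-force model: end forces E*A*eps_ms, stress from plain Hooke's law.

    When the bar is clamped the supports absorb the end forces, so the
    strain is zero; when it is free the forces stretch it by eps_ms.
    """
    strain = 0.0 if clamped else eps_ms
    stress = young * strain
    return strain, stress


def strain_based(young, eps_ms, clamped):
    """Initial-strain model (thermal analogy): stress = E * (strain - eps_ms)."""
    strain = 0.0 if clamped else eps_ms  # free bar: zero stress gives strain = eps_ms
    stress = young * (strain - eps_ms)
    return strain, stress


YOUNG = 200e9   # Pa, hypothetical value for steel
EPS_MS = 5e-6   # hypothetical magnetostrictive strain (m/m)

for clamped in (False, True):
    label = "clamped" if clamped else "free"
    print(label, "force-based:", force_based(YOUNG, EPS_MS, clamped),
          "strain-based:", strain_based(YOUNG, EPS_MS, clamped))
```

Both functions return the same strain in each case, but only the initial-strain form gives zero stress for the free bar and a compressive stress for the clamped one, which is the behavior described in the text as the actual one.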

Fig. 5. Illustration of the magnetostriction and the state of stress in an iron sample under the effect of magnetostrictive forces. Left: clamped sample; right: free sample. For each sample the strain and stress obtained with the equivalent forces are compared with the actual behavior. In both cases the elongation is correct but the stress is wrong.

IV. THE ENERGY-BASED CONSTITUTIVE EQUATIONS 

The energy-based, coupled constitutive equations of 

the electrical steel are derived from an appropriate 

representation of the Helmholtz free energy [21], which 

itself is based on previous empirical observations made 

from the measurement of magnetostriction under different 

stresses and flux densities [22]. A summary of the results 

of these measurements, together with the model 

predictions, is shown in Fig. 6. The Helmholtz free energy 

in a sample is written as: 

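[Equation (2), the expression of the Helmholtz free energy ψ, is not legible in this printout. The surviving fragments show an isotropic elastic part in I_1 and I_2 with the coefficients λ and G, and a magnetic part in I_4, I_5 and I_6 that involves functions g_i(I_1), i = 0..4, and the reference flux density B_ref. The full expression is given in [21] and [23].]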

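where the invariants I_1 .. I_6 are: 

$$ I_1 = \mathrm{tr}(\boldsymbol{\varepsilon}), \quad I_2 = \tfrac{1}{2}\,\mathrm{tr}(\boldsymbol{\varepsilon}^2), \quad I_3 = \tfrac{1}{3}\,\mathrm{tr}(\boldsymbol{\varepsilon}^3) \qquad (3) $$

$$ I_4 = \mathbf{B}\cdot\mathbf{B}, \quad I_5 = \mathbf{B}\cdot\boldsymbol{\varepsilon}\mathbf{B}, \quad I_6 = \mathbf{B}\cdot\boldsymbol{\varepsilon}^2\mathbf{B} \qquad (4) $$

λ is the first Lamé parameter, G the shear modulus of the material and ρ the mass density. The functions g_0 and g_i, i = 1..4, are exponential functions of I_1 defined in (5) and (6), whose expressions are not legible in this printout (see [23]). The coefficients indexed i = 0..6 are model parameters and B_ref is a reference flux density. The magnetomechanically coupled constitutive equations are derived from (2) as 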

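$$ \boldsymbol{\sigma} = \sum_{i=1,\, i\neq 3}^{6} \rho\,\frac{\partial\psi}{\partial I_i}\frac{\partial I_i}{\partial \boldsymbol{\varepsilon}} \quad\text{and}\quad \mathbf{M} = -\sum_{i=1,\, i\neq 3}^{6} \rho\,\frac{\partial\psi}{\partial I_i}\frac{\partial I_i}{\partial \mathbf{B}} \qquad (7) $$

where σ and ε are the stress and strain tensors and M the magnetization. An extensive derivation of the model and its equations is given in [23]. The model results in an explicitly coupled formulation for the stress tensor σ(B, ε) and the magnetic field strength vector H(B, ε) in terms of the magnetic flux density and the mechanical strain tensor, given by (8) and (9). 

[Equations (8) and (9), which give σ(B, ε) and H(B, ε) respectively, are not legible in this printout; the full expressions are given in [23].]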

In (8) and (9), two auxiliary quantities are used to shorten the notation. They are defined in (10) as sums over i = 0..4 involving the functions g_i and the invariants I_1 and I_4; the expression (10) is not legible in this printout. 

[Fig. 6: two panels, magnetostriction (m/m, scale x 10^-6, from -2 to 5) versus flux density (T, from -2 to 2), with curves for applied stresses of 3.9 MPa, 0.0 MPa, -1.7 MPa and -6.1 MPa.]

Fig. 6. Measured magnetostrictive strains at different flux densities and applied mechanical stresses (left) and comparison with model prediction (right). Positive stresses are tensile and negative ones are compressive. 
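As a small numerical illustration of the invariants on which the energy-based model is built, the sketch below evaluates I_1 to I_6 for a given strain tensor and flux density vector. It assumes the usual definitions I_1 = tr(ε), I_2 = tr(ε²)/2, I_3 = tr(ε³)/3, I_4 = B·B, I_5 = B·εB and I_6 = B·ε²B; the function name and the sample values are hypothetical.

```python
import numpy as np


def invariants(strain, b):
    """Return (I1, ..., I6) for a symmetric strain tensor and a flux density vector."""
    strain = np.asarray(strain, dtype=float)
    b = np.asarray(b, dtype=float)
    strain_sq = strain @ strain
    i1 = np.trace(strain)                      # tr(eps)
    i2 = 0.5 * np.trace(strain_sq)             # tr(eps^2) / 2
    i3 = np.trace(strain_sq @ strain) / 3.0    # tr(eps^3) / 3
    i4 = b @ b                                 # B . B
    i5 = b @ strain @ b                        # B . eps B
    i6 = b @ strain_sq @ b                     # B . eps^2 B
    return i1, i2, i3, i4, i5, i6


# Example: uniaxial-type strain state with B along x (illustrative values).
example = invariants(np.diag([1.0e-3, -0.5e-3, 0.0]), [1.5, 0.0, 0.0])
```

Only I_1, I_2, I_4, I_5 and I_6 enter the constitutive equations, since the sums in (7) exclude i = 3.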

V. INCORPORATION INTO FE SIMULATIONS 

The starting point for the implementation of the 

magnetomechanically coupled equation in FE analysis of 

electrical machines is an in-house 2D software package. 

The field equations have been previously coupled with

the electrical circuit equations of the machine winding, 

which makes it possible to feed the model from a voltage 

source and also resolve the induced voltages and currents 

in other parts of the machine, e.g., cage winding or solid 

rotor, through time stepping analysis [8]. 

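The coupled constitutive equations (8) and (9) were first linearized to first order; then the weak form of the Galerkin method was applied to (8) and the principle of virtual work to (9). This resulted in the linearized weak forms (11) and (12). 

[Equations (11) and (12) are not legible in this printout. The surviving fragments show that (11) contains volume integrals involving the test function w and the field strength H together with a boundary integral, and that (12) contains the virtual strains and displacements together with the forces f_mech, f_inert and f_surf.]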

where w is a test or weight function and the quantities with a hat are virtual ones. u is the mechanical displacement vector and f_mech, f_inert and f_surf are respectively the mechanical body forces, the inertia body forces and the surface forces. In the implementation, the shape functions of the finite element approximation are used as weight functions. 

Equations (11) and (12) are then spatially discretized using the standard finite element procedure and inserted in the in-house code, thus replacing the nonlinear model of the iron cores. The insertion of these equations in the code does not affect the electromagnetic coupling, as the latter takes place in the windings and conducting regions only, whereas the magnetomechanical coupling takes place in the non-conducting iron. Special attention, however, has to be given to the region formed by the airgap or, more properly, to the interface between any non-iron region and the iron core. This is because the Maxwell stress tensor makes sense only if it is computed from both regions at any interface. Thus, when assembling the system matrix, the contributions to the nodal values of the Maxwell stress are computed from both the iron elements and the non-iron elements that share an interface with the iron core. 

In the case of the force-based model of magnetostriction, 

the approach is quite similar to the one presented above, 

except that the equivalent magnetostrictive forces as well 

as the other magnetic forces computed with (1) have been 

inserted in the model as external mechanical forces. Such 

implementation was already reported in [19]. 

The implemented formulation and software were first tested on a simple model consisting of an iron disc excited through a Dirichlet boundary condition on its outer edge. The magnetic vector potential on this edge was set to time-dependent values so as to create a rotating field uniformly distributed over the surface of the test sample. The boundaries of the sample were free to move and only the center of the disc was fixed in both the x- and y-directions. Fig. 7 shows the original and deformed mesh used in the model when the flux density was 1.5 T either along the x-axis or at an angle of 45 deg. to it. The results show that, thanks to the tensor representation and formulation, the effects of the shear stress and strain are correctly computed. The model was also applied to an induction machine-like device without an airgap, in view of minimizing all the other magnetic forces. The results from this verification are reported in [23] and show good agreement between the measured and computed displacements. The model was then applied to the computation of the deformation and vibrations of two induction machines, the parameters of which are given in Table I. The extensive results from the simulations of the two machines as well as a comprehensive analysis of these results are reported in [24]. Here we present a comparison between the computed displacements of nodes on the tooth of the machines when only the magnetostriction is accounted for and when the so-called reluctance forces (Maxwell stress) are also accounted for. Although the vibrations depend on the machine construction and cannot be generalized, this result gives the reader an estimate of the effect of magnetostriction on the vibrations of rotating electrical machines. Fig. 8 shows this comparison for both machines. Due to the difference in the number of pole pairs, the vibration behaviors are different too. 

[Fig. 7: two mesh plots, x-coordinate (m) versus y-coordinate (m), both axes from -0.1 to 0.15.]

Fig. 7. Test model, original and deformed mesh computed with the developed method with a flux density of 1.5 T along the x-axis (left) and at 45 deg (right). 

Table I. Parameters of the simulated induction machines 

Parameter                            Machine I    Machine II
Rated voltage                        380 V        380 V
Slip                                 2 %          3.2 %
Rated current                        60 A         27 A
Rated power                          30 kW        15 kW
Number of pole pairs                 1            2
Outer diameter of the stator core    323 mm       235 mm
Inner diameter of the stator core    190.2 mm     145 mm
Number of stator slots               36           36
Outer diameter of the rotor core     188.37 mm    144.1 mm
Number of rotor slots                28           34

Fig. 8. Comparison between computed displacements of a node on the stator tooth in machine 1 (left) and machine 2 (right). The subscripts stand for the radial (r) and tangential directions. 1 is the case with only magnetostriction and 2 the case when both magnetostriction and reluctance forces are considered.

VI. FUTURE TRENDS AND DEVELOPMENTS 

The developed coupled magnetomechanical model, although bidirectional between magnetics and mechanics, is single-valued and thus does not take dissipation into account. On the other hand, existing hysteresis models, either for magnetics [25] or for mechanics [26], are based on the Preisach approach, which is mathematically rigorous but does not give clear insight into the energetic balance between the two subsystems. The only energy-based hysteresis model that describes both magnetism and mechanics is the one developed by Jiles [27], but its application to electrical steel and further to the simulation of electrical machines has not been reported yet. This might be due to the sharp saturation of the magnetization curves of electrical steel, which causes convergence problems, but also to the fact that the original model treats the compressive and tensile stresses in a symmetric way, which is not adequate for electrical steel, where the compressive stresses have a much more pronounced effect on the magnetic and magnetostrictive properties of the material. 

The dynamic behavior of magnetostriction also needs to 

be addressed [28] as well as the effect of anisotropy [20]. 

These shortcomings in the modeling and simulation of electrical machines still need to be addressed in view of a better estimation of the vibrations of electrical machines and of the iron losses, which are known to depend on the state of stress in the material. The presented model already estimates the effect of magnetostriction on the state of stress, but other causes of stress have to be added too. 

The models to be developed and used need both 

characterization of the material under different flux 

densities and mechanical stress and verification 

procedures to assess the validity of the models. The work 

presented in [29] is a good start for the characterization 

work. We have also developed characterization 

methodologies and analyzed their accuracy [30]; this 

work is still continuing. The verification work still needs 

some development. 

REFERENCES 

[1] M. J. Islam, A. Arkkio, “Effects of pulse-width-modulated supply 

voltage on eddy currents in the form-wound stator winding of a cage 

induction motor,” IET Electric Power Applications, vol. 3, no. 1, pp. 

50-58, January 2009. 

[2] A. Moses, P. Phillips, “Some effects of stress in Goss-oriented silicon-iron,” 

IEEE Trans. Magn., vol. 14, no. 5, pp. 353-355, Sep 1978. 

[3] A. Moses, “Effects of applied stress on the magnetic properties of 

high permeability silicon-iron,” IEEE Trans. Magn., vol. 15, no. 6, pp. 

1575-1579, Nov 1979. 

[4] Du Trémolet de Lacheisserie, E., 1993. Magnetostriction–Theory 

and Applications of Magnetoelasticity. CRC Press Inc. 432 pages. 

[5] K. Fujisaki, R. Hirayama, T. Kawachi, S. Satou, C. Kaidou, M. 

Yabumoto, T. Kubota, “Motor core iron loss analysis evaluating 

shrink fitting and stamping by finite-element method,” IEEE Trans. 

Magn., vol. 43, no. 5, pp.1950-1954, May 2007 

[6] A. Moses, “Measurement of magnetostriction and vibration with 

regard to transformer noise,” IEEE Trans. Magn., vol. 10, no. 2, pp. 

154-156, Jun 1974 

[7] B. Weiser, H. Pfutzner, J. Anger, “Relevance of magnetostriction and 

forces for the generation of audible noise of transformer cores,” IEEE 

Trans. Magn., vol. 36, no. 5, pp.3759-3777, Sep 2000 

[8] A. Arkkio, Analysis of induction motors based on the numerical 

solution of the magnetic field and circuit equations, Doctoral 

dissertation, 1987, Helsinki University of Technology, Espoo, Finland 


[9] S. J. Salon, R. Palma, C. C. Hwang, “Dynamic modeling of an 

induction motor connected to an adjustable speed drive,” IEEE Trans. 

Magn., vol. 25, no. 4, pp. 3061-3063, Jul 1989. 

[10] K. Delaere, W. Heylen, R. Belmans, K. Hameyer, “Comparison of 

induction machine stator vibration spectra induced by reluctance 

forces and magnetostriction,” IEEE Trans. Magn., vol. 38, no. 2, pp. 

969-972, Mar 2002 

[11] M. Besbes, Z. Ren, A. Razek, “Finite element analysis of magnetomechanical 

coupled phenomena in magnetostrictive materials,” IEEE 

Trans. Magn., vol. 32, no. 3, pp. 1058-1061, May 1996 

[12] A. Bossavit, “Edge-element computation of the force field in 

deformable bodies,” IEEE Trans. Magn., vol. 28, no. 2, pp. 1263- 

1266, Mar 1992. 

[13] J. L. Coulomb, “A methodology for the determination of global 

electromechanical quantities from a finite element analysis and its 

application to the evaluation of magnetic forces, torques and 

stiffness,” IEEE Trans. Magn., vol. 19, no. 6, pp. 2514-19, 1983. 

[14] O.A. Mohammed, T. Calvert, R. McConnell, “Coupled 

magnetoelastic finite element formulation including anisotropic 

reluctivity tensor and magnetostriction effects for machinery 

applications,” IEEE Trans. Magn., vol. 37, no. 5, pp. 3388-3392, Sep 

2001. 

[15] L. Vandevelde, J.A.A. Melkebeek, “Magnetic forces and 

magnetostriction in electrical machines and transformer cores,” IEEE 

Trans. Magn., vol. 39, no. 3, pp. 1618- 1621, May 2003. 

[16] A. Belahcen, “Vibrations of rotating electrical machines due to 

magnetomechanical coupling and magnetostriction,” IEEE Trans. 

Magn., vol. 42, no. 4, pp. 971-974, Apr. 2006. 

[17] A. Kameari, “Local calculation of forces in 3D FEM with edge 

elements,” International Journal of applied Electromagnetics in 

Materials, vol. 3, pp. 231-240, 1993. 

[18] A. Belahcen, “Magnetoelastic coupling in rotating electrical 

machines,” IEEE Trans. Magn., vol. 41, no. 5, pp. 1624-1627, May 

2005. 

[19] A. Belahcen, Magnetoelasticity, magnetic forces and 

magnetostriction in electrical machines. Doctoral dissertation, 2004, 

Helsinki University of Technology, Espoo, Finland. 

[20] S. Somkun, A. J. Moses, P. I. Anderson, P. Klimczyk, 

“Magnetostriction anisotropy and rotational magnetostriction of a 

nonoriented electrical steel,” IEEE Trans. Magn., vol. 46, no. 2, pp. 

302-305, Feb. 2010. 

[21] A. Belahcen, K. Fonteyn, A. Hannukainen and R. Kouhia, “On 

numerical modeling of coupled magnetoelastic problem”, Nordic 

Seminar on Computational Mechanics NSCM-21, pp. 203-206, Oct. 

16-17.2008, Trondheim, Norway. 

[22] A. Belahcen and M. El Amri, “Measurement of stress-dependent 

magnetisation and magnetostriction of electrical steel sheets,” 

International Conference on Electrical Machines ICEM, Sep. 5- 

8.2004, Cracow, Poland. 

[23] K. A. Fonteyn, Energy-based magneto-mechanical model for 

electrical steel sheets, Doctoral dissertation, 2010, Aalto University, 

Finland 

[24] K. A. Fonteyn, A. Belahcen, P. Rasilo, R. Kouhia, A. Arkkio, 

“Contribution of Maxwell stress in air on the deformations of 

induction machines,” Journal of Electrical Engineering & Technology, 

vol. 7, no. 3, pp. 336-341, 2012. 

[25] E. Dlala, A. Belahcen, K. Fonteyn, M. Belkasim, “Improving loss 

properties of the Mayergoyz vector hysteresis model,” IEEE Trans. 

Magn., vol. 46, no. 3, pp. 918-924, March 2010. 

[26] A. Bergqvist, On magnetic hysteresis modelling, Doctoral 

dissertation, 1994, Royal Institute of Technology, Stockholm, 

Sweden 

[27] D. C. Jiles, D. L. Atherton, “Theory of ferromagnetic hysteresis 

(invited),” Journal of Applied Physics, vol. 55, no. 6, pp. 2115-2120, 

Mar 1984. 

[28] P. Rasilo, A. Belahcen, “Iron losses, magnetoelasticity and 

magnetostriction in ferromagnetic steel laminations,” IEEE 

Conference on Electromagnetic Field Computation CEFC, 11- 

14.11.2012, Oita, Japan. 

[29] Y. Kai, Y. Tsuchida, T. Todaka, M. Enokizono, “Influence of stress 

on vector magnetic property under rotating magnetic flux conditions,” 

IEEE Trans. Magn., vol. 48, no. 4, pp. 1421-1424, April 2012. 

[30] A. Belahcen, P. Rasilo, K. Fonteyn, R. Kouhia and A. Arkkio, 

“Modeling the stress effect on the measurement of magnetostriction in 

electrical sheets under rotational magnetization,” IEEE Conference on 

Electromagnetic Field Computation CEFC, 11-14.11.2012, Oita, 

Japan.


Magnetic Saturation Effect on Modeling Squirrel-cage 

Induction Motors with Stator Inter-turn Fault 

*Jawad Faiz, † Mansour Ojaghi and † Mahdi Sabouri 

*Center of Excellence on Applied Electromagnetic Systems, School of Electrical and Computer Engineering, 

College of Engineering, University of Tehran, Tehran, Iran, E-mail: jfaiz@ut.ac.ir 

† Department of Electrical Engineering, University of Zanjan, Zanjan, Iran 

Abstract— Coupled circuits model (CCM) of squirrel-cage induction motors is the most detailed and complete analytical 

model for analyzing the performance of the faulty induction motors. This paper extends the CCM to a saturable model 

including variable degrees of the saturation effects using an appropriate air gap function and novel techniques for locating 

the angular position of the air gap flux density and estimating the saturation factor. Comparing simulated and experimental 

magnetization characteristics shows the accuracy of the new saturable model. Using saturable and non-saturable models, 

various simulations are carried out on faulty induction motors, and then, by comparing the results, the impacts of the 

saturation on the performance of the faulty motor are presented. 

Index Terms—Induction motors, Inter-turn fault, Magnetic saturation, Coupled circuits model. 

I. INTRODUCTION 


Implementing a proper condition monitoring system is 

essential to prevent squirrel-cage induction motors 

(SCIMs) from catastrophic failure. The stator inter-turn 

fault is a relatively frequent fault and if not diagnosed on 

time, causes major breakdown of the SCIM. SCIM 

performance under the inter-turn fault can be analyzed 

using magnetically coupled circuits model (CCM) [1], 

[2]. Such analysis helps to understand the performance of faulty SCIMs, to extract proper indexes for the various faults and to develop effective fault diagnosis and condition monitoring techniques for SCIMs. To do so, the CCM must be as exact as possible; however, ignoring the magnetic saturation may keep it away from the required exactness [3]. 

For economical utilization of the magnetic material, the operating regions of electrical machines have to be extended above the knee of the magnetization characteristic, which forces the machine into the saturation region. Many attempts have been made so far to include saturation effects in SCIM models, including the CCM [4]–[9]. An extension to the CCM of a healthy SCIM, which includes variable degrees of saturation, has been reported in [9]. The proposed saturable CCM (SCCM) needs to track the air gap rotating flux density, and this has been done by a rather simple technique in [9]. However, distortion of the air gap flux density distribution due to the fault makes the application of that technique erroneous. Thus, the existing SCCM is not suitable for analyzing faulty SCIMs. 

In this paper, the rotor meshes are used as the air gap 

flux samplers. The flux-linkages of the rotor meshes, 

calculated at each step, are used to estimate the air gap 

flux density distribution. Then, Fourier series analysis is used to determine the space harmonics of the air gap flux density, including the fundamental harmonic amplitude (B1) and its phase angle (φ1). B1 is used to determine the saturation factor (Ksat) properly and φ1 is utilized to track the air gap flux density. Therefore, a modified version of the SCCM is obtained, whose accuracy is verified by comparing the magnetization characteristics determined through simulation and experiments. Then, using both the modified SCCM and the normal CCM, the performance of the SCIM with some inter-turn faults is simulated. By comparing the results, the impacts of magnetic saturation on the performance of faulty SCIMs and on their fault indexes become clear. Comparisons are also made 

with experimental results, which confirm the accuracy of 

the proposed model. 

II. SCCM OF SCIM WITH INTER-TURN FAULT 

Any loop on the rotor of SCIM consisting of any two 

adjacent rotor bars is considered as a circuit mesh. The 

shorted turns in the stator are also considered as an 

independent circuit (phase 'd'). Applying KVL to the 

rotor meshes, stator phases and stator shorted turns and 

adding the related torque and mechanical equations, 

CCM dynamic equations are attained. These equations 

with complete details are presented in [2]. Self/mutual 

inductances of the various circuits and their derivatives 

versus the rotor position are the most important 

parameters of CCM equations. These inductances are 

calculated using the following equation [10]: 

$$ L_{xy} = \mu_0\, r\, l \int_0^{2\pi} g^{-1}\, n_x\, N_y \, d\varphi \qquad (1) $$

where x and y can be any phase of the stator (a, b, c or d) or any mesh of the rotor (1 to R), μ0 is the magnetic permeability of air, r is the air gap mean radius, l is the stack length, g^-1 is the inverse air gap function, n_x is the turn function of phase (mesh) x and φ is the angle in the stator stationary reference. In the case of a uniform air gap machine, N_y is the winding function of circuit y, but in the case of a non-uniform air gap machine, it is the modified winding function of circuit y. 
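For illustration, the inductance integral (1) can be evaluated numerically on a uniform grid of stator angles. The sketch below is an assumption-laden example, not the analytic expressions of the paper: it uses a rectangle rule, takes the modified winding function as the turn function minus its mean weighted by the inverse air gap (which reduces to the ordinary winding function for a uniform gap), and uses hypothetical machine dimensions.

```python
import numpy as np

MU0 = 4e-7 * np.pi  # permeability of free space (H/m)


def turn_function(phi, start, end, turns):
    """Turn function of one coil whose sides lie at angles start and end (rad)."""
    return np.where((phi >= start) & (phi < end), float(turns), 0.0)


def inductance(n_x, n_y, inv_gap, r, l, phi):
    """Rectangle-rule estimate of L_xy = mu0 * r * l * integral(g^-1 * n_x * N_y dphi)."""
    dphi = phi[1] - phi[0]
    # Modified winding function: remove the g^-1-weighted mean of the turn function.
    n_y_mean = np.sum(inv_gap * n_y) / np.sum(inv_gap)
    winding_y = n_y - n_y_mean
    return MU0 * r * l * np.sum(inv_gap * n_x * winding_y) * dphi


# Hypothetical example: self inductance of a full-pitch 90-turn coil, uniform 0.5 mm gap.
phi_grid = np.linspace(0.0, 2.0 * np.pi, 3600, endpoint=False)
n_coil = turn_function(phi_grid, 0.0, np.pi, 90)
inv_gap_uniform = np.full_like(phi_grid, 1.0 / 0.5e-3)
l_self = inductance(n_coil, n_coil, inv_gap_uniform, r=0.05, l=0.1, phi=phi_grid)
```

For the saturable model, the uniform `inv_gap_uniform` array would be replaced by the inverse of the fictitious air gap function.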

Saturation of the magnetic material causes its 

reluctance to be increased against the machine's flux. 

Similar increase of the reluctance can be achieved by a 

proportional increase in the air gap length along the main 

flux path [4]. Anywhere within the core material, 

reluctance increase caused by the saturation depends on 

the exact value of the flux density, but is independent of the flux direction. Thus, it is expected that the fictitious air gap length (g_f) fluctuates a complete cycle every half cycle of the air gap flux density distribution around the air gap. Assuming a sinusoidal form for this fluctuation, the following satisfies the mentioned requirements [9]: 

$$ g_f = g'\,[\,1 - \rho\cos(2P(\varphi - \varphi_f))\,] \qquad (2) $$

where φf is the angular position of a zero crossing point of the air gap flux density in the stator reference, P is the number of pole pairs, g' is the mean value of g_f and ρ is the peak value of its fluctuation (per unit of g'). g' and ρ are determined as follows: 

$$ g' = \frac{3K_{sat}}{K_{sat}+2}\, g_e \qquad (3) $$

$$ \rho = \frac{2(K_{sat}-1)}{3K_{sat}} \qquad (4) $$

where g_e is the effective air gap length of the unsaturated machine, which is related to the mechanical air gap length (g0) by Carter's coefficient (g_e = kc g0). Ksat is the saturation factor, which is defined as the ratio of the fundamental components of the air gap voltage for the saturated and unsaturated conditions [4]. 
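A minimal sketch of the fictitious air gap function follows. It reads (2) to (4) as g_f = g'[1 - ρ cos(2P(φ - φf))], g' = 3 Ksat g_e / (Ksat + 2) and ρ = 2(Ksat - 1) / (3 Ksat); the symbol ρ and the minus sign in (2) are assumptions, chosen so that the gap equals g_e at the zero crossings of the flux density, where the iron is unsaturated. The function name and the numerical values are hypothetical.

```python
import numpy as np


def fictitious_air_gap(phi, phi_f, k_sat, g_e, pole_pairs):
    """Fictitious air gap length g_f at stator angle(s) phi (rad)."""
    g_mean = g_e * 3.0 * k_sat / (k_sat + 2.0)    # mean value g', Eq. (3)
    rho = 2.0 * (k_sat - 1.0) / (3.0 * k_sat)     # per-unit fluctuation, Eq. (4)
    angle = 2.0 * pole_pairs * (np.asarray(phi, dtype=float) - phi_f)
    return g_mean * (1.0 - rho * np.cos(angle))   # Eq. (2)


# Hypothetical example: 2-pole machine, g_e = 0.4 mm, Ksat = 1.3.
phi_grid = np.linspace(0.0, 2.0 * np.pi, 721)
gap = fictitious_air_gap(phi_grid, phi_f=0.3, k_sat=1.3, g_e=0.4e-3, pole_pairs=1)
```

With Ksat = 1 the function returns g_e everywhere, so the unsaturated model is recovered as a special case.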

Substituting the inverse of (2) into (1), using the modified winding function theory, assuming the turn functions to have only step variations at the centers of the slots and evaluating the indefinite integrals, exact analytic equations are obtained for the various inductances. Then, differentiating these equations with respect to the rotor position (θr), exact analytic equations are also obtained for the derivatives of the inductances. The equations are functions of Ksat and φf; thus, by using them, a variable degree of saturation enters the model of the SCIM. The equations are included in the appendix. 

For the proposed SCIM (see Section V), Figure 1 

shows the turn functions of the phase 'a' winding before 

and after inter-turn fault. The winding has 4 concentric 

coils each with 90 turns in healthy condition. In the faulty 

case 14 turns from the outer coils are short-circuited. 

Figure 1 also shows the turn function of the shorted turns 

(phase 'd'). As an example, Figure 2 shows the variations of the self inductance of phase 'd' (Ldd) with Ksat and φf, calculated using the proposed analytical equations. As seen, as Ksat increases, Ldd varies more widely around a decreasing mean value. The increases of g' and ρ with increasing Ksat are respectively the reasons for the decreasing mean value and the increasing fluctuation amplitude of the inductance. A complete rotation of φf causes two complete cycles of variation of Ldd, which is due to the two poles of the SCIM. 

Figure 1: Turn function of phase 'a' winding: a) before and b) after inter-turn fault with 14 shorted turns in the outer coil and c) turn function of phase 'd' after the fault occurrence 

Figure 2: Phase 'd' self inductance (Ldd) variation versus Ksat and φf 

III. DETERMINING Ksat AND φf IN FAULTY SCIM DURING SIMULATION STEPS 

Generally, the air gap flux density distribution is disturbed in faulty SCIMs and this leads to an error when the technique of [9] is used for determining φf. In addition, Ksat was determined using the air gap voltage, which depends on the rotation speed of the air gap flux as well as on its amplitude, while the degree of saturation depends only on the flux density amplitude. This causes difficulty when applying the model to variable-speed drive systems, as the air gap voltage before saturation is no longer constant, but varies with the reference speed. 

When simulating SCIM using the CCM or SCCM, 

flux-linkages of all the rotor meshes are evaluated within 

any simulation step. Since any rotor mesh consists of 

only one turn, these flux-linkages are the total fluxes 

passing through the meshes. Considering the short air 

gap length and ignoring the small flux leakages, the air 

gap flux density next to any rotor mesh i can be estimated 

as: 

$$ B_{ai} = \frac{\lambda_i}{A} \qquad (5) $$

where λi is the flux-linkage of mesh i and A is the area above the mesh in the air gap: 

$$ A = \frac{2\pi r l}{R} \qquad (6) $$

where R is the number of rotor bars. Therefore, the air gap flux density distribution is estimated within any simulation step. Then, using Fourier series analysis, the space harmonics of the air gap flux density are determined as follows: 

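$$ B_{sn} = \frac{1}{\pi}\int_0^{2\pi} B(\theta)\sin(nP\theta)\,d\theta = \frac{1}{\pi}\sum_{i=1}^{R} B_{ai}\int_{\theta_i}^{\theta_{i+1}} \sin(nP\theta)\,d\theta = \frac{1}{nP\pi}\sum_{i=1}^{R} B_{ai}\,[\cos(nP\theta_i) - \cos(nP\theta_{i+1})] \qquad (7) $$

$$ B_{cn} = \frac{1}{\pi}\int_0^{2\pi} B(\theta)\cos(nP\theta)\,d\theta = \frac{1}{\pi}\sum_{i=1}^{R} B_{ai}\int_{\theta_i}^{\theta_{i+1}} \cos(nP\theta)\,d\theta = \frac{1}{nP\pi}\sum_{i=1}^{R} B_{ai}\,[\sin(nP\theta_{i+1}) - \sin(nP\theta_i)] \qquad (8) $$

where Bsn and Bcn are the sine and cosine components of the nth space harmonic of the estimated air gap flux density respectively, θ is the angle in the rotor reference and θi the angle of the center of rotor bar i. Then, the phase angles of the space harmonics (in electrical radians) and their amplitudes can be calculated as follows, respectively: 

$$ \varphi_n = \tan^{-1}\!\left(\frac{B_{sn}}{B_{cn}}\right) \qquad (9) $$

$$ B_n = \sqrt{B_{sn}^2 + B_{cn}^2} \qquad (10) $$

Having the phase angle of the fundamental harmonic (φ1), φf can be estimated by: 

$$ \varphi_f = \theta_r + \frac{\varphi_1}{P} + \frac{\pi}{2P} \qquad (11) $$

Also, using the amplitude of the fundamental harmonic, Ksat is obtained by: 

$$ K_{sat} = \frac{B_1}{B_0} \qquad (12) $$

where B0 is related to the flux density of the knee point (Bkp) of the core material within the teeth. 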
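The estimation chain of this section (mesh flux-linkages, piecewise-constant air gap flux density, Fourier components, then φf and Ksat) can be sketched as below. Several points are assumptions rather than the authors' implementation: `arctan2` is used in place of a plain inverse tangent to resolve the quadrant, the sign of the π/(2P) term in (11) is taken as positive (the air gap function has period π/P, so either sign selects a zero crossing), and Ksat is clamped to 1 when B1 is below B0. All names and numbers are hypothetical.

```python
import numpy as np


def gap_flux_harmonic(flux_linkages, bar_angles, r, l, pole_pairs, n=1):
    """Return (B_sn, B_cn, B_n, phase_n) of the n-th space harmonic.

    flux_linkages: flux-linkage of each rotor mesh (Wb), mesh i spans bars i and i+1.
    bar_angles: angle of the center of each rotor bar in the rotor reference (rad).
    """
    lam = np.asarray(flux_linkages, dtype=float)
    theta = np.asarray(bar_angles, dtype=float)
    area = 2.0 * np.pi * r * l / lam.size                 # Eq. (6)
    b_a = lam / area                                      # Eq. (5)
    theta_next = np.append(theta[1:], theta[0] + 2.0 * np.pi)
    k = n * pole_pairs
    b_s = np.sum(b_a * (np.cos(k * theta) - np.cos(k * theta_next))) / (k * np.pi)  # Eq. (7)
    b_c = np.sum(b_a * (np.sin(k * theta_next) - np.sin(k * theta))) / (k * np.pi)  # Eq. (8)
    return b_s, b_c, np.hypot(b_s, b_c), np.arctan2(b_s, b_c)                        # Eqs. (9), (10)


def saturation_state(b1, phase1, theta_r, pole_pairs, b0):
    """Return (phi_f, k_sat) from the fundamental amplitude and phase angle."""
    phi_f = theta_r + phase1 / pole_pairs + np.pi / (2.0 * pole_pairs)  # Eq. (11), assumed sign
    k_sat = max(1.0, b1 / b0)   # Eq. (12); the lower bound of 1 is an assumption
    return phi_f, k_sat
```

Because the mesh fluxes average the flux density over one rotor slot pitch, the recovered amplitude is slightly lower than the true peak; for 28 bars and a 2-pole field the reduction is well below one percent.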

These modifications in the estimation of φf and Ksat make the SCCM applicable to the simulation of mains- and drive-connected SCIMs under the inter-turn fault. However, the knee point is not a distinct point on the magnetization characteristic and there is no precise analytical method to determine Carter's coefficient (kc). Therefore, a Genetic Algorithm (GA) is used for the optimal estimation of the required B0 and kc in the next section. 

IV. ESTIMATION OF B0 AND kc 

GA is a heuristic searching method for the optimal 

solution based on mechanics of natural selection and 

natural genetics. It evolves into new generations of 

individuals by using knowledge from the previous 

generations and generally includes three fundamental 

genetic operations of reproduction, crossover and 

mutation. The searching process is independent of the form of the objective function and, being population-based, is less prone to being trapped by local optima than a search that follows the steepest descent direction. The solution of a complex problem can be started with weak initial estimates, which are then corrected through the evolutionary process. Figure 3 

shows a flowchart of the applied GA. More details about 

GA can be found in [11]. 

To use GA for estimating B0 and kc in the proposed 

SCIM, a proper objective (fitness) function must be 

defined. Such fitness function may be achieved by 

closely fitting the magnetization characteristic (i.e. the 

no-load voltage versus no-load current curve) of the 

SCIM obtained from SCCM to that obtained by 

experiments. To do so, the no-load stator RMS line currents are measured in the laboratory with n different stator line voltages up to the nominal voltage (I^e_ai, I^e_bi, I^e_ci, for i = 1, 2, …, n). Then, for any distinct values of B0 and kc, the corresponding line currents with the same stator voltages are obtained by simulation (I^s_ai, I^s_bi, I^s_ci, for i = 1, 2, …, n). Now, the fitness function is defined as follows: 

$$ \mathrm{Fit.} = \sum_{i=1}^{n} \left( (I^{e}_{ai} - I^{s}_{ai})^2 + (I^{e}_{bi} - I^{s}_{bi})^2 + (I^{e}_{ci} - I^{s}_{ci})^2 \right) \qquad (13) $$

[Figure 3 flowchart boxes: Start; Select variables (solution space); Construct initial population randomly; Calculate the fitness for each population; Reproduction; Apply crossover; Apply mutation; Ending condition reached? (Yes/No); Select the best population; End.]

Figure 3: Flowchart of the applied Genetic Algorithm 

Figure 4: Convergence rate of the algorithm 

The lower the fitness, the better will be the estimation of 

B0 and kc. With a population size of 15, GA converges to 

the required solution after about 130 iterations. Figure 4 

shows the convergence rate of the algorithm. The 

optimum values for B0 and kc are 0.5007 and 1.2058 

respectively. Figure 5 compares the 

magnetization characteristics obtained from the 

experiment and SCCM with optimal B0 and kc. Good 

agreement between the simulated and experimental 

results is evident. 
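The parameter estimation described in this section can be sketched as a small genetic algorithm that minimizes the sum-of-squares fitness of (13). The sketch is illustrative only: `toy_model` is a hypothetical stand-in for the SCCM no-load simulation, and the operators (truncation selection, arithmetic crossover, Gaussian mutation, elitism) are assumptions, since the text specifies only the population size of 15 and about 130 iterations.

```python
import numpy as np


def fitness(measured, simulated):
    """Sum of squared current differences over all voltages and phases, as in Eq. (13)."""
    return float(np.sum((np.asarray(measured) - np.asarray(simulated)) ** 2))


def toy_model(b0, kc, voltages):
    """Hypothetical stand-in for the SCCM no-load simulation (NOT the real model)."""
    v = np.asarray(voltages, dtype=float)
    i_line = kc * v / 400.0 + (v / 380.0) ** 5 / (10.0 * b0)
    return np.column_stack([i_line, i_line, i_line])   # phases a, b, c


def run_ga(measured, voltages, bounds, pop_size=15, generations=130, seed=0):
    """Minimize the fitness over (B0, kc); returns the best pair and the fitness history."""
    rng = np.random.default_rng(seed)
    low, high = np.array(bounds, dtype=float).T

    def score(pop):
        return np.array([fitness(measured, toy_model(b0, kc, voltages)) for b0, kc in pop])

    pop = rng.uniform(low, high, size=(pop_size, 2))
    history = []
    for _ in range(generations):
        scores = score(pop)
        pop = pop[np.argsort(scores)]
        history.append(float(scores.min()))
        parents = pop[: pop_size // 2]                   # reproduction: keep the fitter half
        children = []
        while len(children) < pop_size - 1:
            pa, pb = parents[rng.integers(len(parents), size=2)]
            w = rng.uniform(size=2)
            child = w * pa + (1.0 - w) * pb              # arithmetic crossover
            child += rng.normal(0.0, 0.02, size=2) * (high - low)   # Gaussian mutation
            children.append(np.clip(child, low, high))
        pop = np.vstack([pop[:1], np.array(children)])   # elitism: the best individual survives
    scores = score(pop)
    history.append(float(scores.min()))
    return pop[np.argmin(scores)], history


# Hypothetical usage: synthetic "measurements" generated with B0 = 0.5 and kc = 1.2.
voltages = np.linspace(100.0, 380.0, 8)
measured = toy_model(0.5, 1.2, voltages)
best, history = run_ga(measured, voltages, bounds=[(0.2, 1.0), (1.0, 1.5)])
```

Keeping the best individual in every generation makes the recorded best fitness non-increasing, which gives a convergence curve of the kind shown in Figure 4.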

V. EXPERIMENTAL TEST RIG 

A test rig consisting of a 750 W, 380 V, 50 Hz, 2-pole, Y-connected SCIM was set up in the laboratory. The three-phase windings of the stator of the SCIM were removed and replaced by similar windings with various taps taken out from different turns of the phase ‘a’ winding. An inter-turn fault with a variable number of shorted turns is produced in the SCIM by connecting any two of the taps. The motor is mechanically coupled with a magnetic powder brake to produce an adjustable mechanical load. A digital scopemeter is used for sampling the line currents of the SCIM [12]. Two independent current or voltage signals can be sampled and recorded at 5000 samples per second. Figure 6 shows a photograph of the test rig. 

Figure 5: Magnetization characteristic obtained from simulation and from experiments (---). 

Figure 6: Photograph of the test rig. 

VI. SIMULATION AND ANALYSIS 

The proposed SCIM is simulated using the developed SCCM under various load, supply and fault conditions, and the corresponding tests are performed on the real SCIM in the laboratory. The results are presented, compared and analyzed in this section. Figure 7 shows the variation of φf with time, in the interval between -π and π, obtained during a simulation by the method introduced in Section III. The constant slope of the variation of φf is due to the constant speed of the air gap rotating magnetic field (the synchronous speed). 

Figure 8 shows the normalized spectra of the stator 

line current in the faulty SCIM with 21 shorted turns 

under no load. As seen, all the even/odd harmonics are 

present in the experimental and SCCM result but not in 

the CCM result. The amplitude of the 3 rd harmonic of the 

stator line current (150 Hz) might be considered as an 

index for the inter-turn fault [13], [14]. As seen, this 

amplitude in the SCCM result is very closer to the 

experimental result than that in the CCM result. 

The stator negative-sequence current component at the fundamental frequency is one of the earliest indices introduced to diagnose the inter-turn fault [15], [16]. In the healthy symmetrical SCIM with a balanced three-phase supply, the negative-sequence current is zero.


Figure 7: Simulated time variations of f 

Figure 8: Normalized spectra of stator line current under no load with 

21 shorted turns obtained by: a) experiment, b) SCCM and c) CCM 

However, the inter-turn fault quickly increases this 

current component. To determine the negative sequence 

current at the required frequency, the related line current 

phasors of the stator are obtained first by using the 

sampled currents and the Fourier algorithm which is 

conventional in the field of the digital protection [17]. 

Then, using the line current phasors, the negative 

sequence current phasor is determined [15]. Knowing the 

amplitude, phase angle and frequency of the negative 

sequence current, its waveform can also be sketched. For 

the proposed SCIM with 21 shorted turns under full load, 

the negative-sequence currents obtained through the 

simulation and experiments have been shown in Figure 9. 

As seen, the saturation effect, introduced by the SCCM, increases the amplitude of the current so that it approaches the experimental results.
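The phasor-based procedure described above can be sketched as follows. This is a minimal illustration, assuming a full-cycle DFT as the "Fourier algorithm" and the standard symmetrical-component transform; the function names, the 50 Hz fundamental and the synthetic test currents are assumptions made for the example, not taken from the measurement setup.

```python
import numpy as np

def fundamental_phasor(samples, fs, f0):
    """Full-cycle DFT estimate of the peak phasor at frequency f0."""
    n = int(round(fs / f0))                      # samples in one fundamental period
    window = np.asarray(samples[:n], dtype=float)
    k = np.arange(n)
    return (2.0 / n) * np.sum(window * np.exp(-2j * np.pi * k / n))

def negative_sequence(i_a, i_b, i_c):
    """Negative-sequence phasor from the three line current phasors."""
    a = np.exp(2j * np.pi / 3)                   # 120 degree rotation operator
    return (i_a + a**2 * i_b + a * i_c) / 3.0

# Synthetic example (assumed values): 1.0 A positive plus 0.5 A negative sequence,
# sampled at the 5000 samples per second quoted for the scopemeter.
fs, f0 = 5000.0, 50.0
t = np.arange(int(fs / f0)) / fs
w = 2 * np.pi * f0
ia = 1.0 * np.cos(w * t) + 0.5 * np.cos(w * t)
ib = 1.0 * np.cos(w * t - 2 * np.pi / 3) + 0.5 * np.cos(w * t + 2 * np.pi / 3)
ic = 1.0 * np.cos(w * t + 2 * np.pi / 3) + 0.5 * np.cos(w * t - 2 * np.pi / 3)

phasors = [fundamental_phasor(x, fs, f0) for x in (ia, ib, ic)]
i2 = negative_sequence(*phasors)
print(abs(i2))
```

With measured data, the three sampled line currents would replace the synthetic arrays; the window must cover an integer number of fundamental periods for the estimate to be free of leakage.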

Simulations and experiments on the proposed SCIM 

with 14 and 21 shorted turns were repeated under 

various load levels. Table I compares the attained 

amplitudes of the stator negative sequence current at the 

fundamental frequency. As seen, the negative sequence 

current increases by increasing the fault degree, while the 

load level change has negligible impact on the current. 

Also, the SCCM results follow the experimental results 

more 

closely than the CCM results. 

TABLE I
AMPLITUDE OF STATOR NEGATIVE SEQUENCE CURRENT (mA) UNDER VARIOUS LOAD LEVELS

Faulty SCIM with 14 short-circuited turns
Load (of rated)   No load   20%   40%   60%   80%   Full load
Experimental      470       478   481   482   489   494
SCCM              467       490   480   456   462   487
CCM               413       457   438   411   422   430

Faulty SCIM with 21 short-circuited turns
Load (of rated)   No load   20%   40%   60%   80%   Full load
Experimental      698       690   690   712   726   730
SCCM              693       700   668   701   702   702
CCM               619       623   609   647   648   650

Figure 9: Stator negative sequence current obtained by a) experiment, b) SCCM and c) CCM in faulty SCIM with 21 shorted turns under full load.

However, any negative sequence component in the mains voltage, which is permissible to some small extent

in real mains, produces negative sequence current in the 

stator of the healthy SCIM. Simulation results indicate that, with 2% negative sequence in the stator voltage of the healthy SCIM, the negative sequence current changes by 0.54% from no load to full load, by 1.86% from the non-saturable (CCM) to the saturable motor (SCCM) and by 6.65% from the healthy to the single-turn shorted condition (the weakest fault), while changing the negative sequence voltage from 2% to 5% changes the negative sequence current by 148.8% in the healthy SCIM. Therefore, the negative sequence current as an inter-turn fault index is highly sensitive to the voltage imbalance level, which is undesirable, as shown in Figure 10.

The negative sequence apparent impedance of the 

SCIM is also used as an index to diagnose the stator 

inter-turn faults [18], [19]. This impedance is the ratio of 

the stator negative sequence voltage phasor to its 

negative sequence current phasor. Figure 11 shows results similar to those of Figure 10 for this index, obtained by simulation. As seen, this index is much less sensitive to the voltage imbalance, and magnetic saturation is the most influential side factor affecting it.


The flux linkages of the rotor meshes, calculated in every simulation step of the CCM for the SCIM, were used to estimate the air gap flux density distribution. Then, the space harmonic components of the air gap flux density were determined using Fourier series analysis. The phase angle of the space fundamental harmonic was utilized to locate the air gap flux density during simulation of the faulty SCIMs. Also, the amplitude of this fundamental harmonic can be used to evaluate the saturation factor more reasonably. Therefore, a saturable CCM was developed which is capable of analyzing faulty SCIMs. Comparing the simulation results with the corresponding experimental results indicates that the saturable model is more precise than the non-saturable model. Further study showed that magnetic saturation affects the inter-turn fault indices more than the load level and the stator voltage imbalance.

Figure 10: Sensitivity of stator negative sequence current to the weakest fault (single-turn), saturation, voltage imbalance and load level, evaluated by simulations.

Figure 11: Sensitivity of the negative sequence apparent impedance of the SCIM to weakest fault (single-turn), saturation, voltage imbalance and load level, evaluated by simulations.

APPENDIX 

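The indefinite integral of the inverse air gap function ($g_f^{-1}$) is determined first as follows:

$$ f(\varphi, \varphi_f, K_{sat}) = \int g_f^{-1}(\varphi, \varphi_f, K_{sat})\, d\varphi = \frac{1}{2 P g' \sqrt{1 - \rho^2}} \cos^{-1}\!\left[ \frac{\cos(2P(\varphi - \varphi_f)) - \rho}{1 - \rho \cos(2P(\varphi - \varphi_f))} \right] \tag{A1} $$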

Then, the analytical equation for the inductances of the stator windings is obtained as:

L 

x y 

m 

o 

r l 

nx 

( ti )[ n y ( ti ) f s ( f , K sat )] 

(A2) 

i1 

[ f ( 

i1 

, , K 

f 

sat 

) f ( , , K 

where x and y account for the stator phases, m is the

number of the stator slots, i is the angle of center of the 

stator slot i, ti is the angle of center of the stator tooth 

after the stator slot i, and fs is: 

2 

g 

1 

m 

fs ( f , Ksat 

) ny 

( ti) 

[ f ( i1, 

f , Ksat 

) f ( i, 

f , K 

2 

i1 

i 

f 

sat 

)] 

sat 

)] 

(A3)

Equation (A2) is independent of r and its derivative 

versus r is zero for all x and y. For the rotor meshes the 

inductance equation is: 

Luv o r l [ C fr 

( f , K sat )] [ f ( x1, 

f , K sat ) f ( x, 

f , K sat )] (A4) 

now u and v accounts for the rotor meshes 1 to R, f is f 

in the rotor reference (f = f - r), u is the angle of center 

of the rotor bar number u in the rotor reference, C=1 for 

u = v and C=0 otherwise and fr is: 

2 

g 

1 

 

fr ( f , K sat ) [ f ( y1, 

f , K sat ) f ( y , f , K 

2 

sat 

)] 


(A5) 

Equations (A4)-(A5) depend on r because of f. 

Considering the relationship between these two variables 

yields Luv/r= -Luv/f and thus: 

Luv 

or 

l[ 

C fr 

( f , K 

 

r 

where: 

f 

r ( f , K 

o 

rl 

 

f 

sat 

f 

( u1, 

f , Ksat) 

f 

( u, 

f , Ksat) 

)] [ 

 

] 

 

 

sat 

 

)] 

[ f ( , , K ) f ( , , K 

u1 

f 

f 

sat 

u 

f 

f 

sat 

)] 

(A6) 

f r ( f , Ksat 

) g 

 

 

f 

2 

1 

 

2 

f 

( y 1, 

f , Ksat 

) f 

( y , f , Ksat 

) 

[ 

 

] 

 

f 

 

f 

(A7) 

f ( i , f , Ksat 

) 

1 

 

; i x1, 

x , y1, 

y 

 

f g[ 

1 

cos( 2P( 

i 

f ))] 

(A8) 

For the mutual inductances between the rotor meshes and 

stator phases the following equation is obtained: 

k2 

1 

i1 

 

i 

Lmn 

o 

r l 

[ nn 

( ) f s ( f , K sat )] 

ik 

2 

1 

(A9) 

[ f ( , , K ) f ( , , K )] 

i 

f 

sat 

i1 

where x and y account for the rotor meshes and stator 

phases respectively, k1-1 and k2+1 are the angles of the 

two bars of mesh x in the stator reference, k1 to k2 are the 

stator slots between k1-1 and k2+1 and e.g. k1 is the angle 

of the stator slot k1. Now k1-1 and k2+1 are responsible to 

the dependency of inductances on r. The derivatives of 

the two mentioned parameters versus r are equal to 1, 

while the derivatives of the other parameters in (A9) 

versus r are zero. Using these facts and some 

differentiation rules leads to: 

Lx 

y 

 

r 

 

o r l 

k1 

 

k11 

[ n y ( ) f s ( f , K sat )] 

2 

g[ 

1 

cos( 2P( 

k11 

 

f ))] 

 

o 

r l 

k 21 

 

k 2 

[ n y ( ) f s ( f , K sat )] 

2 

g[ 

1 

cos( 2P( 

 

))] 

k 21 

f

f

sat

(A10)

REFERENCES

[1] X. Luo, Y. Liao, H. A. Toliyat, A. El-Antably and T. A. Lipo, 

“Multiple coupled circuit modeling of induction machines,” 

IEEE Trans. Ind. Applications, vol. 31, pp. 311 - 318, 

March/April 1995. 

[2] A. Raie, and V. Rashtchi, "Using a genetic algorithm for 

detection and magnitude determination of turn faults in an 

induction motor", Springer-Verlag, vol. 84, pp. 275–279, August 

2002. 

[3] S. Nandi, “Detection of stator faults in induction machines using 

residual saturation harmonics,” IEEE Trans. Ind. Applications, 

vol. 42, no. 5, pp. 1201 - 1208, 2006. 

[4] J. C. Moreira and T.A. Lipo, “Modeling of saturated ac machines 

including air gap flux harmonic components,” IEEE Trans. Ind. 

Applications, vol. 28, pp. 343 - 349, March/April 1992. 

[5] D. Bispo, L. M. Neto, J. T. Resende and D. A. Andrade, “A new 

strategy for induction machine modeling taking into account the 

magnetic saturation ,” IEEE Trans. Ind. Applications, vol. 37, no. 

6, pp. 1710 - 1719, Nov./Dec. 2001. 

[6] T. Tuovinen, M. Hinkkanen, and J. Luomi, “Modeling of saturation due to main and leakage flux interaction in induction machines,” IEEE Trans. Ind. Applications, vol. 46, no. 3, pp. 937 - 945, 2010.

[7] Tu Xiaoping, L.-A. Dessaint, R. Champagne, and K. Al-Haddad, 

“Transient modeling of squirrel-cage induction machine 

considering air-gap flux saturation harmonics,” IEEE Trans. Ind. 

Electronics, vol. 55, no. 7, pp. 2798 - 2809, 2008. 

[8] S. Nandi, “A detailed model of induction machines with 

saturation extendable for fault analysis,” IEEE Trans. Ind. 

Applications, vol. 40, pp. 1302 - 1309, September/October 2004. 

[9] M. Ojaghi and J. Faiz, “Extension to multiple coupled circuit 

modeling of induction machines to include variable degrees of 

saturation effects,” IEEE Trans. Magn., vol. 44, no. 11, pp. 

4053-4056, Nov. 2008. 

[10] J. Faiz, and I. Tabatabaei, "Extension of winding function theory 

for nonuniform air gap in electric machinery," IEEE Trans. Magn., 

vol. 38, pp. 3654-3657, November 2002. 

[11] Z. Michalewicz, Genetic Algorithms + Data Structures = Evolution Programs, Springer-Verlag, 1992.

[12] Fluke 196c/199C Scope-Meter User's Manual, Fluke 

Corporation, Oct. 2001, Netherlands. 

[13] G. Joksimovic, J. Penman, “The detection of interturn short 

circuits in the stator windings of operating motors,” IEEE Trans 

Ind. Electronics, vol. 47, no.5, pp.1078–1084, Oct. 2000. 

[14] J.H. Jung, J.J. Lee, and B.H. Kwon, “Online diagnosis of 

induction motors using MCSA,” IEEE Trans. Ind. Electronics, 

vol. 53, no. 6, pp. 1842–1852, Dec. 2006. 

[15] A.Bellini, F.Filippetti, C.Tassoni, G.A.Capolino, “Advances in 

Diagnostic Techniques for Induction Machines,” IEEE Trans. 

Industrial Electronics, vol.55, no12, pp. 4109-4126, Dec. 2008. 

[16] Wu Qing, and S. Nandi, “Fast single-turn sensitive stator interturn 

fault detection of induction machines based on positive- and 

negative-sequence third harmonic components of line currents,” 

IEEE Trans. Ind. Applications, vol. 46, pp. 974 - 983, 2010. 

[17] A.T.Johns and S.K. Salman, Digital Protection for Power 

Systems. IEE Power series 15, London, UK 1995. 

[18] J.L. Kohler, J. Sottile, and F.C. Trutt, “Condition monitoring of 

stator windings in induction motors: I. Experimental 

investigation on effective negative-sequence impedance 

detector,” IEEE Trans. Ind. Application, vol. 38, pp. 1447–1453, 

2002. 

[19] L. Sang Bin, R.M. Tallam, and T.G. Habetler, “A robust, on-line 

turn-fault detection technique for induction machines based on 

monitoring the sequence component impedance matrix,” IEEE 

Trans. Power Electronics, vol. 18, pp. 865–872, 2003.


Accurate Magnetostatic Simulation of Step-Lap 

Joints in Transformer Cores Using Anisotropic 

Higher Order FEM 

A. Hauck∗, M. Ertl†, J. Schöberl‡ and M. Kaltenbacher§

∗ SIMetris GmbH, Erlangen, Germany † SIEMENS Transformers, Nuremberg, Germany 

‡ Institute for Analysis and Scientific Computing, Vienna University of Technology, Austria 

§ Institute of Mechanics and Mechatronics, Vienna University of Technology, Austria 

E-mail: andreas.hauck@simetris.de 

Abstract—We present a simulation scheme for the accurate simulation of thin magnetic structures, specifically the nonlinear 

magnetic flux distribution in a core step-lap joint with interest in the local saturation near the air gaps. Due to the high 

aspect ratio of the model, we utilize hierarchical higher order finite elements, where the polynomial degree is spatially 

adapted to resolve the flux distribution within the steel sheets. The deterioration of convergence in the iterative conjugate 

gradient (CG) solver is handled by an anisotropic Schwarz-type block preconditioner, grouping the unknowns depending 

on the aspect ratio of the elements. The resulting Newton scheme can optionally be accelerated by a 2-step solution strategy, 

where a start value is computed on a coarse subspace of lowest order in analogy to a full multigrid scheme. 

Index Terms—Step-Lap Joints, Higher Order Finite Elements, Block Preconditioner, Nonlinear Solver 

I. INTRODUCTION


Precise knowledge of the flux distribution in transformer cores is important both for reducing the magnetic losses and for localizing sources of forces (magnetostriction, interlaminar forces). In recent years, a significant reduction of both effects has been achieved using the multi-step-lap technique (see Fig. 1 and 2), where the overlap region of the transformer sheets is shifted in several steps [1]. In order to optimize the layout further, a detailed simulation of the fluxes, including the nonlinear B-H curve of the core, has to be performed. Accurate simulation of such a problem poses some difficulties, as the thickness of the steel sheets is typically in the range of 200 - 300 μm, while the length can be a few meters and the number of vertically stacked sheets adds up to some thousand layers. In addition, the flux variation is very high in the vicinity of the air gaps in the corner, but rather smooth at some distance from them, making it difficult to resolve accurately. Furthermore, the nonlinear permeability of the grain-oriented electrical steel sheets has to be taken into account and modeled correctly.

Fig. 1. Sketch of transformer core with step-lap corners (section view A – A).

Fig. 2. Magnetic flux concentration (45°-view).

Within this paper we solve the nonlinear magnetostatic 

problem using finite elements of higher order, together 

with an iterative preconditioned conjugate gradient (CG) 

method. By exploiting the special structure of the Finite 

Element (FE) basis [2], we can build an effective 

preconditioner for handling elements with high aspect 

ratios, while at the same time the spatial accuracy can be 

adapted to the discretization of the model. The nonlinear 

Newton scheme can be further accelerated by calculating 

a good initial start value on a coarse sub-space. Finally, 

the applicability of the method is demonstrated for the 

aforementioned step-lap core model. 

II. NONLINEAR MAGNETOSTATIC FORMULATION 

The nonlinear magnetostatic problem can be written in 

terms of the magnetic vector potential A as 

$$ \nabla \times \nu(|\nabla \times A|)\,(\nabla \times A) = J + \nabla \times \nu B_0, \qquad B = \nabla \times A, \tag{1} $$

where ν(|∇ × A|) is the nonlinear reluctivity (e.g. of 

steel), J the impressed current density (e.g. of a coil) 

and B0 an additional prescribed flux density. As the 

impressed current density J and the curl of the prescribed 

flux density B0 are assumed to be divergence free, a 

unique solution can either be guaranteed by enforcing 

the Coulomb gauge 

$$ \nabla \cdot A = 0 \tag{2} $$

explicitly (leading to a mixed formulation) or by adding a small regularization term αA to (1), where α ≈ 10^{-6} ν [2]. We will modify this strategy in Section III.

For physical reasons, the vector potential A is only 

continuous in the tangential part, which requires the use 

of H(curl)-conforming vectorial elements, which will be 

introduced in Sec. III.

The nonlinear problem is solved using a Newton 

formulation 

$$ F'(A_k)[\Delta A] = -F(A_k) \tag{3} $$

$$ A_{k+1} = A_k + \eta\, \Delta A . \tag{4} $$

Improvement of convergence is achieved by computing 

a line search parameter η ∈ ]0, 1], which is determined 

by minimizing the residual of (3). The magnetic B-H 

commutation curve is extracted from measured hysteresis 

curves, reducing the problem to a simplified nonlinear 

one, which is given in terms of measured (Bi,Hi)-value 

pairs. By applying a C^1-spline approximation, a smooth monotone approximation is calculated, from which the reluctivity ν and its derivative ν′ (entering the Fréchet derivative F′) can be derived (see Fig. 3). For details we refer to [3].

Fig. 3. Approximation of measured B-H data by C^1-splines.
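The damped Newton update (3)-(4) can be illustrated on a small model problem. The sketch below is only an illustration under stated assumptions: the line search picks η from a fixed set of halved step sizes in ]0, 1] by comparing the residual norm of the nonlinear function, and the scalar test function arctan(x) stands in for the magnetostatic residual; neither choice is taken from the paper.

```python
import numpy as np

def damped_newton(F, J, x0, tol=1e-10, max_iter=50):
    """Newton iteration with a simple residual-minimizing line search."""
    x = np.atleast_1d(np.asarray(x0, dtype=float))
    etas = 0.5 ** np.arange(0, 11)               # candidates 1, 1/2, ..., 1/1024
    for it in range(max_iter):
        r = F(x)
        if np.linalg.norm(r) < tol:
            return x, it
        dx = np.linalg.solve(J(x), -r)           # Newton direction, cf. (3)
        norms = [np.linalg.norm(F(x + eta * dx)) for eta in etas]
        eta = etas[int(np.argmin(norms))]        # step with the smallest residual
        x = x + eta * dx                         # damped update, cf. (4)
    return x, max_iter

# Model problem (assumption): undamped Newton diverges for arctan from x0 = 3.
F = lambda x: np.arctan(x)
J = lambda x: np.array([[1.0 / (1.0 + x[0] ** 2)]])
root, n_steps = damped_newton(F, J, 3.0)
print(root, n_steps)
```

A production code would typically replace the fixed candidate list by a proper one-dimensional minimization, but the fixed list keeps the example short and deterministic.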

III. HIGHER ORDER FINITE ELEMENTS 

The two key requirements for our choice of higher 

order H(curl)-conforming FE shape functions are the 

ability to choose the polynomial degree p independently 

in each local direction (ξ,η,ζ), as well as the availability 

of efficient iterative solution techniques (i.e. an efficient 

preconditioner). The first requirement leads to the use of hierarchical shape functions, where we utilize the hierarchical shape functions of [2], which can be written as

$$ N(T) = \bigoplus_{E} N^{0}_{E} \oplus \bigoplus_{E} N^{\nabla}_{E} \oplus \bigoplus_{F} N_{F} \oplus N^{\nabla}_{F} \oplus N_{I} \oplus N^{\nabla}_{I} \tag{5} $$

The shape functions N are composed of unknowns defined on edges, faces and in the interior (subscripts E, F and I), see Fig. 4 for a hexahedral element.

Fig. 4. Degrees of freedom for the hexahedral element.

The lowest order Nédélec functions $N^{0}_{E}$, which have a constant tangential component along one edge (p = 0), are explicitly included. In addition, higher order gradient components on edges $N^{\nabla}_{E}$, faces $N^{\nabla}_{F}$ and in the interior $N^{\nabla}_{I}$ are represented separately. This key feature - also known as the local exact sequence property - is equivalent to fulfilling the so-called de Rham complex

$$ \mathbb{R} \xrightarrow{\;\mathrm{id}\;} H^1(\Omega) \xrightarrow{\;\mathrm{grad}\;} H(\mathrm{curl},\Omega) \xrightarrow{\;\mathrm{curl}\;} H(\mathrm{div},\Omega) \xrightarrow{\;\mathrm{div}\;} L_2(\Omega) \xrightarrow{\;0\;} \{0\} \tag{6} $$

already on the finite element level, i.e. the gradients, forming the null-space of the curl-operator, can be completely omitted for each type of unknowns (edge, face, interior) separately if only the flux density B is of interest. In [2], this is denoted as the reduced basis and can be used to gauge the problem in the following way:

• For the lowest order Nédélec functions $N^{0}_{E}$, we add the regularization term α as described in Sec. II.

• For the higher order terms, we simply skip the gradient functions $N^{\nabla}_{E}$, $N^{\nabla}_{F}$ and $N^{\nabla}_{I}$.

Another unique advantage of this basis according to [9] 

is that shape functions of arbitrary order are available 

for all types of elements in 2-D and 3-D, utilizing any 

kind of hierarchical 1-D shape functions, e.g. Legendre 

or Gegenbauer. 

A. Anisotropic Adapted Polynomial Degree 

In general the magnetic flux density B is defined as

$$ B = \begin{pmatrix} B_x \\ B_y \\ B_z \end{pmatrix} = \nabla \times A = \begin{pmatrix} \dfrac{\partial A_z}{\partial y} - \dfrac{\partial A_y}{\partial z} \\[2mm] \dfrac{\partial A_x}{\partial z} - \dfrac{\partial A_z}{\partial x} \\[2mm] \dfrac{\partial A_y}{\partial x} - \dfrac{\partial A_x}{\partial y} \end{pmatrix} . \tag{7} $$

In case of thin structures, the in-plane components $(B_x, B_y)$ are dominant (see Fig. 5). Additionally, the variation of the in-plane components of A in z-direction ($\partial A_x / \partial z$, $\partial A_y / \partial z$) in (7) is already resolved accurately by the FE discretization in thickness direction of the single sheet layers. Thus the dominant terms left are $\partial A_z / \partial y$ and $\partial A_z / \partial x$, i.e. the magnetic vector potential A should be approximated quite accurately in the in-plane direction.

Fig. 5. Flux / potential distribution in face on (x, z)-plane.

Therefore, it is advantageous to reflect this behavior in the anisotropic polynomial degree as

$$ p_\eta,\, p_\xi > p_\zeta , \tag{8} $$

assuming that the global z and the local ζ direction coincide. The increase of the polynomial degree only affects the face $N_F$ and inner $N_I$ degrees of freedom, as we skip the higher order gradient functions $N^{\nabla}_{E}$ and $N^{\nabla}_{F}$ due to gauging. This leads to practical order templates like $p_{aniso} = (2, 2, 1)$ or $p_{aniso} = (3, 3, 1)$. Although it seems that the lowest order anisotropic template should be $p_{aniso} = (1, 1, 0)$, this does not lead to more accurate results, as only faces in (x, y)-direction get additional unknowns, which do not contribute to an improved resolution of $A_z$.

As the permeability of air μ0 is typically several orders of magnitude smaller than that of the ferromagnetic core, the flux is mostly concentrated in the core. This allows us to choose a small isotropic polynomial degree of $p_{air} = 0$ for the approximation. This last step is especially effective if structured grids are utilized or if the air domain is large.

B. Single Step Iterative Solution Scheme 

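The resulting system of equations after FE discretization can be written as

$$ K(A)\,A = f , \tag{9} $$

with

$$ \begin{pmatrix} K_{N_0 N_0} & K_{N_0 F} & K_{N_0 I} \\ K_{F N_0} & K_{FF} & K_{FI} \\ K_{I N_0} & K_{IF} & K_{II} \end{pmatrix} \begin{pmatrix} A_{N_0} \\ A_F \\ A_I \end{pmatrix} = \begin{pmatrix} f_{N_0} \\ f_F \\ f_I \end{pmatrix} . \tag{10} $$

The interior unknowns $A_I$ can be eliminated by static condensation as

$$ A_I = K_{II}^{-1} \left( f_I - K_{I N_0} A_{N_0} - K_{IF} A_F \right) , \tag{11} $$

where $K_{II}$ can be inverted on the element level. Substituting this result back in (10) results in the reduced system

$$ \begin{pmatrix} \hat{K}_{N_0 N_0} & \hat{K}_{N_0 F} \\ \hat{K}_{F N_0} & \hat{K}_{FF} \end{pmatrix} \begin{pmatrix} A_{N_0} \\ A_F \end{pmatrix} = \begin{pmatrix} \hat{f}_{N_0} \\ \hat{f}_F \end{pmatrix} , \tag{12} $$

with the modified matrices and RHS vectors as

$$ \hat{K}_{N_0 N_0} = K_{N_0 N_0} - K_{N_0 I} K_{II}^{-1} K_{I N_0} \tag{13} $$

$$ \hat{K}_{N_0 F} = K_{N_0 F} - K_{N_0 I} K_{II}^{-1} K_{IF} \tag{14} $$

$$ \hat{K}_{F N_0} = K_{F N_0} - K_{FI} K_{II}^{-1} K_{I N_0} \tag{15} $$

$$ \hat{K}_{FF} = K_{FF} - K_{FI} K_{II}^{-1} K_{IF} \tag{16} $$

$$ \hat{f}_{N_0} = f_{N_0} - K_{N_0 I} K_{II}^{-1} f_I \tag{17} $$

$$ \hat{f}_F = f_F - K_{FI} K_{II}^{-1} f_I . \tag{18} $$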

The two main effects of the static condensation are:

• The number of unknowns is reduced significantly, as with increasing polynomial degree p only face and interior unknowns are added.

• The condition number κ of the reduced system (12) is much smaller than that of the full system (10), resulting in fewer iterations of the CG solver.
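Static condensation is a Schur complement on the interior block, and both effects listed above can be checked on a small dense example. The sketch below is an illustration only: it treats the whole matrix as one block (the paper eliminates interior unknowns element by element), uses a random symmetric positive definite matrix as a stand-in for the stiffness matrix, and the helper names are hypothetical.

```python
import numpy as np

def condense(K, f, n_keep):
    """Eliminate the trailing (interior) unknowns via a Schur complement."""
    K_bb, K_bi = K[:n_keep, :n_keep], K[:n_keep, n_keep:]
    K_ib, K_ii = K[n_keep:, :n_keep], K[n_keep:, n_keep:]
    # Solve K_ii X = [K_ib, f_i] once instead of forming the inverse explicitly.
    X = np.linalg.solve(K_ii, np.column_stack([K_ib, f[n_keep:]]))
    K_hat = K_bb - K_bi @ X[:, :-1]
    f_hat = f[:n_keep] - K_bi @ X[:, -1]
    return K_hat, f_hat

def recover_interior(K, f, a_b, n_keep):
    """Back-substitution for the eliminated unknowns, cf. (11)."""
    return np.linalg.solve(K[n_keep:, n_keep:],
                           f[n_keep:] - K[n_keep:, :n_keep] @ a_b)

rng = np.random.default_rng(0)
M = rng.standard_normal((8, 8))
K = M @ M.T + 8.0 * np.eye(8)                    # SPD stand-in for the stiffness matrix
f = rng.standard_normal(8)

K_hat, f_hat = condense(K, f, n_keep=5)          # 5 edge/face, 3 interior unknowns
a_b = np.linalg.solve(K_hat, f_hat)
a_i = recover_interior(K, f, a_b, n_keep=5)
a_full = np.linalg.solve(K, f)
print(np.linalg.cond(K_hat), np.linalg.cond(K))
```

For a symmetric positive definite matrix the condition number of the Schur complement cannot exceed that of the full matrix, which is consistent with the second effect; how large the gain is depends on the actual FE basis.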

In order to solve the reduced system (12), we apply a Preconditioned Conjugate Gradient (PCG) method. It was shown in [2] that an α-robust preconditioner $C^{-1}$ can be simply defined by a block Jacobi preconditioner (i.e. an additive Schwarz method, ASM), defined by

$$ C = \begin{pmatrix} \hat{K}_{N_0 N_0} & 0 \\ 0 & \hat{K}^{B}_{FF} \end{pmatrix} , \tag{19} $$

where the single blocks are formed as follows:

• $\hat{K}_{N_0 N_0}$: The lowest order Nédélec functions can be either solved by a sparse direct solver [10] or by a suitable iterative method, respecting the Helmholtz decomposition of the magnetic vector potential (see e.g. [11]).

• $\hat{K}^{B}_{FF}$: For every face, all unknowns are grouped in one block (superscript B). The application of the preconditioner basically is just the inversion of the single face blocks $K_{FF}^{-1}$.
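A block Jacobi preconditioner of the form (19) inside a PCG loop can be sketched as follows. This is a generic textbook PCG written for a small dense matrix; the index lists in `blocks` are hypothetical stand-ins for the lowest order block and the per-face blocks, and the tridiagonal test matrix is an assumption made for the example.

```python
import numpy as np

def block_jacobi(K, blocks):
    """Return a function applying C^{-1}, with C = blockdiag(K[b, b] for b in blocks)."""
    inverses = [(np.asarray(b), np.linalg.inv(K[np.ix_(b, b)])) for b in blocks]
    def apply(r):
        z = np.zeros_like(r)
        for idx, block_inv in inverses:
            z[idx] = block_inv @ r[idx]          # independent solve per diagonal block
        return z
    return apply

def pcg(K, f, apply_prec, tol=1e-10, max_iter=500):
    """Preconditioned conjugate gradients for a symmetric positive definite K."""
    x = np.zeros_like(f)
    r = f - K @ x
    z = apply_prec(r)
    p = z.copy()
    rz = r @ z
    for it in range(1, max_iter + 1):
        Kp = K @ p
        alpha = rz / (p @ Kp)
        x += alpha * p
        r -= alpha * Kp
        if np.linalg.norm(r) <= tol * np.linalg.norm(f):
            return x, it
        z = apply_prec(r)
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x, max_iter

n = 6
K = 4.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
f = np.ones(n)
apply_prec = block_jacobi(K, blocks=[[0, 1, 2], [3, 4, 5]])
x, iterations = pcg(K, f, apply_prec)
print(iterations)
```

Merging several index lists into one larger list is all that is needed to model larger diagonal blocks; the price is a larger dense inverse per block.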

IV. ANISOTROPIC BLOCK PRECONDITIONER 

If the preconditioner (19) is applied for structures with 

a very high aspect ratio (AR), the convergence of the 

iterative solver deteriorates. This can be explained by an increase in the condition number κ = λmax/λmin, as the entries in the stiffness matrix K scale with 1/h, where h is the mesh size, leading to strongly coupled entries for nearly parallel edges / faces and thus to nearly singular systems with high condition numbers [5].

The idea proposed in [5] is based on a singularity 

decomposition technique, where new unknown variables 

are introduced and assigned to groups of parallel edges 

with small distance. However, this method introduces 

new matrix entries, as all edges in one group couple via 

the auxiliary variable. In addition, the method is only 

applicable to 1st order elements, as only edge degrees of 

freedom are considered. 

An alternative approach is taken in [4], where a plane smoother for nodal and edge components of the A − φ formulation, respecting the Helmholtz decomposition, is applied within a geometric multigrid (MG) solver. Again, explicit knowledge of the anisotropic direction is needed a priori. In our approach, the idea of [4] is extended to the p-version of the FEM. As the lowest order edge contributions $K_{N_0 N_0}$ are already solved with a direct solver and the inner degrees of freedom $K_{II}$ get eliminated by static condensation, only the face contributions $\hat{K}_{FF}$ are affected by the anisotropy. Thus we can modify the initial face blocks $\hat{K}^{B}_{FF}$ of the preconditioner matrix C (19) by grouping all unknowns of the faces perpendicular to the thin direction in one diagonal block $\hat{K}^{B_{ai}}_{FF}$, if the aspect ratio of the element exceeds a user-defined threshold $AR_{th}$. We can even generalize the idea, allowing for two anisotropic / thin directions within one element, e.g. in Fig. 6 the faces F4 to F10 couple strongly, as the size in both ξ- and ζ-direction is small compared to the extent in η-direction. This is especially useful in meshes with tensor-product structure.

Fig. 6. Thin structure with 2 distinct face groups {F1, F2, F3} and {F4, ..., F10}, where η is the long direction.

The modified preconditioner matrix is then defined as

$$ C = \begin{pmatrix} \hat{K}_{N_0 N_0} & 0 \\ 0 & \hat{K}^{B_{ai}}_{FF} \end{pmatrix} . \tag{20} $$

The procedure for computing $\hat{K}^{B_{ai}}_{FF}$ without explicit knowledge of the thin direction(s) is sketched in Algorithm 1. It collects strongly coupled (thin) faces in a graph and determines the blocks of $\hat{K}^{B_{ai}}_{FF}$ by calculating the set of connected components of it. As the only information needed for the algorithm is the size of the elements in each direction, the procedure can be applied to general 3-D elements (tetrahedron, wedge, pyramid).

Algorithm 1: Definition of anisotropic face blocks.

Input: elements e of mesh T
Output: groups of thin faces G_i, i = 1, ..., n_G
Data: graph of connected anisotropic faces G = (V, E)

foreach e ∈ T do
    if AR_max(e) ≥ AR_th then
        compute size of element w.r.t. local directions (h_ξ, h_η, h_ζ)
        h_max = max(h_ξ, h_η, h_ζ)
        foreach d ∈ {ξ, η, ζ} do
            if h_max / h_d ≥ AR_th then
                get faces F1, F2 perpendicular to d-direction
                insert (F1, F2) in G
n_G = # of connected components of G
for i = 1 to n_G do
    G_i = connected faces in i-th component of G
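Algorithm 1 maps naturally onto a union-find structure. The sketch below is one possible reading, not the authors' implementation: it assumes a direction d counts as thin when h_max / h_d reaches the threshold, and it uses a hypothetical element record with a `sizes` dictionary (extent per local direction) and a `faces` dictionary (the two face ids perpendicular to each direction).

```python
from collections import defaultdict

def anisotropic_face_groups(elements, ar_th):
    """Group faces that are coupled through thin element directions."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]        # path halving
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for elem in elements:
        sizes = elem["sizes"]                    # e.g. {"xi": 1.0, "eta": 1.0, "zeta": 1e-3}
        h_max = max(sizes.values())
        for d, h_d in sizes.items():
            if h_max / h_d >= ar_th:             # direction d is thin
                face_1, face_2 = elem["faces"][d]
                union(face_1, face_2)            # edge (F1, F2) of the face graph

    groups = defaultdict(list)
    for face in list(parent):
        groups[find(face)].append(face)
    return sorted(sorted(g) for g in groups.values())

# Three thin elements stacked in zeta share faces 1 and 2; a cube stays ungrouped.
thin = {"xi": 1.0, "eta": 1.0, "zeta": 1e-3}
cube = {"xi": 1.0, "eta": 1.0, "zeta": 1.0}
elements = [
    {"sizes": thin, "faces": {"xi": (100, 101), "eta": (102, 103), "zeta": (0, 1)}},
    {"sizes": thin, "faces": {"xi": (104, 105), "eta": (106, 107), "zeta": (1, 2)}},
    {"sizes": thin, "faces": {"xi": (108, 109), "eta": (110, 111), "zeta": (2, 3)}},
    {"sizes": cube, "faces": {"xi": (112, 113), "eta": (114, 115), "zeta": (10, 11)}},
]
groups = anisotropic_face_groups(elements, ar_th=10.0)
print(groups)
```

Each returned group would become one diagonal block of the face part of the preconditioner; faces that never appear in a thin direction keep their individual blocks.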

V. NONLINEAR TWO-STEP STRATEGY 

In order to accelerate the resulting Newton scheme, we utilize a 2-step strategy, motivated by full multigrid methods (see [6] and [8]). The idea is to compute a good start approximation for the Newton scheme on a coarse sub-space $T^H$, which is formed in the p-version by the lowest order functions $K_{N_0 N_0}$. The fine space $T^h$ is spanned by the complete set of polynomials [7]. Thus the stiffness matrix K in (10) can be split into $K_H$ and $K_h$ as follows

$$ K_H = K_{00} = K_{N_0 N_0} \tag{21} $$

$$ K_h = K = \begin{pmatrix} K_{00} & K_{01} \\ K_{10} & K_{11} \end{pmatrix} . \tag{22} $$

Due to the hierarchical basis functions, the interpolation operator $I_H^h$ is trivially defined as

$$ A_h = [I, 0]^T A_H = I_H^h A_H . \tag{23} $$

This allows us to perform a 2-step solution strategy as follows:

1) Solve a few Newton steps on the small coarse system $K_{N_0 N_0}$ (21), using a direct solver.

2) Interpolate the coarse space solution $A_H$ to the fine space $A_h$ according to (23).

3) Proceed with the full system $K_h$ (22) using the solution strategy of Section III-B.

Fig. 7. Model of step-lap core without air domain (scale factor 30 in thickness direction). Labeled dimensions: 15 cm, 30 cm and 0.96 mm; one layer with its air gaps is shown in an exploded view.
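The three steps can be mimicked on a toy problem to show how the hierarchical prolongation (23) feeds the fine-space Newton iteration. Everything specific in this sketch is an assumption: the cubic model nonlinearity, the tridiagonal matrix, the choice of the leading 3 unknowns as "coarse space", and the use of a dense direct solve on both levels instead of the PCG scheme of Section III-B.

```python
import numpy as np

def newton(residual, jacobian, a0, tol=1e-10, max_iter=50):
    """Plain Newton iteration; returns the iterate and the number of steps taken."""
    a = np.array(a0, dtype=float)
    for it in range(max_iter):
        r = residual(a)
        if np.linalg.norm(r) < tol:
            return a, it
        a = a + np.linalg.solve(jacobian(a), -r)
    return a, max_iter

def make_problem(K, f, c=0.1):
    """Toy nonlinearity (assumption): F(a) = K a + c a^3 - f, cube taken componentwise."""
    residual = lambda a: K @ a + c * a**3 - f
    jacobian = lambda a: K + np.diag(3.0 * c * a**2)
    return residual, jacobian

def prolongate(a_coarse, n_fine):
    """Hierarchical prolongation A_h = [I, 0]^T A_H: pad the coarse vector with zeros."""
    a_fine = np.zeros(n_fine)
    a_fine[: a_coarse.size] = a_coarse
    return a_fine

n_fine, n_coarse = 8, 3
K = 4.0 * np.eye(n_fine) - np.eye(n_fine, k=1) - np.eye(n_fine, k=-1)
f = np.ones(n_fine)

# Step 1: a few (here 2) Newton steps on the coarse block.
res_H, jac_H = make_problem(K[:n_coarse, :n_coarse], f[:n_coarse])
a_H, _ = newton(res_H, jac_H, np.zeros(n_coarse), max_iter=2)

# Steps 2 and 3: prolongate and continue on the full system.
res_h, jac_h = make_problem(K, f)
a_two_step, steps_two = newton(res_h, jac_h, prolongate(a_H, n_fine))
a_one_step, steps_one = newton(res_h, jac_h, np.zeros(n_fine))
print(steps_two, steps_one)
```

Both starts reach the same solution; whether the coarse start actually saves fine-space iterations depends on how much of the solution the coarse space captures, so no iteration counts are claimed here.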

VI. APPLICATION: STEP-LAP CORE MODEL 

The applicability of the method is demonstrated for a typical 45°-multi-step-lap joint region of a transformer core (see Fig. 7) with 4 layers of steel sheet, each 0.24 mm in thickness, with a step-lap of 2 air gaps (width: 1 mm) in each sheet. The model is discretized by 2568 hexahedral elements and 3172 nodes. The nonlinear B-H curve used is the one depicted in Fig. 3. As excitation, we apply a prescribed flux density B0 = 0.1 − 2.5 T in y-direction.

Fig. 8. Concentration of magnetic flux lines in corner (45°-view) for uniform p = 0 (top) and p = 3 (bottom) with B0 = 1.0 T (scale factor 30 in thickness direction).

A. Initial Results 

Initially, we choose an isotropic polynomial degree p = 0, ..., 3 and compare the spatial resolution of the magnetic flux density near the air gaps. Here, the Newton algorithm takes between 3 and 9 iterations.

The results of the simulation are visualized on a very fine postprocessing mesh. In Fig. 8 it is clearly visible that the continuation of the flux lines across the air gaps is poorly approximated for p = 0 and that the curvature of the streamlines is unphysical between the air gaps. In contrast, the simulation using p = 3 accurately resolves the flux concentration above and below the air gaps (depicted in red). The same observation holds true for the absolute value of the magnetic flux density in Fig. 9, where the flux concentration between the air gaps is smeared over a large area for p = 0.

Fig. 9. Flux distribution for uniform p = 0 (top) and p = 3 (bottom) with B0 = 1.0 T (scale factor 30 in thickness direction).

From Fig. 10 we deduce that the iterative method is slower than the direct method by a factor of 2 to 10 for all excitation values B0. In contrast, its memory consumption is only about 50% of that of the direct solver for higher polynomial degrees p, as seen in Table I (SC denotes the use of static condensation).

Fig. 10. Simulation time for direct and iterative solution approach 

without anisotropic block preconditioner. 

TABLE I
MEMORY REQUIREMENT AND DOFS FOR DIFFERENT POLYNOMIAL DEGREES (SC: STATIC CONDENSATION)

Polynomial degree p_iso    0      1       2        3
# Total DOFs               7824   43956   141840   332292
# Inner DOFs               -      12840   71904    208008
Memory usage (GB)
  Direct solver            0.24   0.39    1.12     3.61
  Direct solver (SC)       0.24   0.37    0.92     2.32
  Iterative 1-step (SC)    0.24   0.28    0.53     1.61
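The figures in Table I can be cross-checked with a few lines of arithmetic. The snippet below only re-derives ratios from the tabulated values and the stated element count of 2568; the closed form 2p³ + 3p² for the inner unknowns per element is an observation that happens to fit the three tabulated degrees, not a formula given in the text.

```python
n_elements = 2568
total_dofs = {0: 7824, 1: 43956, 2: 141840, 3: 332292}
inner_dofs = {1: 12840, 2: 71904, 3: 208008}
memory_direct = {0: 0.24, 1: 0.39, 2: 1.12, 3: 3.61}      # GB, direct solver
memory_iter_sc = {0: 0.24, 1: 0.28, 2: 0.53, 3: 1.61}     # GB, iterative 1-step (SC)

# Inner unknowns per hexahedral element for each polynomial degree.
per_element = {p: n // n_elements for p, n in inner_dofs.items()}

# Share of unknowns removed by static condensation.
condensed_share = {p: inner_dofs[p] / total_dofs[p] for p in inner_dofs}

# Memory of the iterative scheme relative to the direct solver.
memory_ratio = {p: memory_iter_sc[p] / memory_direct[p] for p in memory_direct}

for p in sorted(inner_dofs):
    print(p, per_element[p], round(condensed_share[p], 2), round(memory_ratio[p], 2))
```

For p = 2 and p = 3 the memory ratio comes out at roughly 0.47 and 0.45, in line with the "about 50%" quoted in the text, and static condensation removes about half to two thirds of the unknowns at these degrees.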


B. Use of Anisotropic Block Preconditioner 

The poor runtime performance of the iterative 1-step scheme in Fig. 10 can be explained by the extremely high aspect ratios of up to 1:1000 shown in Fig. 11: all elements within the steel sheets have aspect ratios higher than 1:400, leading to over 3000 CG iterations on average.

Fig. 11. Aspect ratio of step-lap setup (not shown for elements in air).

If we utilize the anisotropic preconditioner (20) for varying aspect ratio thresholds ARth = {1000, 500, 100, 50, 10}, the iteration numbers and the time for solving the linear equation system drop significantly (see Fig. 12). The results are compared for the iterative 1-step solver with B0 = 1.0 T. From an initial CG iteration count of about 3000 (ARth = 1000) we achieve an average reduction to 100-150 iterations for ARth = 10, corresponding to a factor of 20-30, depending on the polynomial degree.

Fig. 12. Reduction of CG iterations (left) and solution time (right) for varying aspect ratio threshold ARth.

The effect on the solution time is similar: a reduction by a factor of 9 (p = 3) to 25 (p = 1) can be achieved, making the iterative scheme comparable in runtime to the direct solver. The increase in memory for storing larger diagonal blocks is very moderate, being in the range of 5-15% compared to the non-blocked version.

The rate of reduction in iterations does not depend heavily on the polynomial degree, making the preconditioner a p-robust method for practical applications. For all the following results, we apply the preconditioner with a default threshold of ARth = 10.

C. Application of Two-Step Approach 

By applying the 2-step strategy of Section V, an additional decrease in runtime can be observed (see Fig. 13). We start here with 2 Newton iterations on the coarse space $T^H$ with p = 0 and use the result as start value for the fine space $T^h$. On average, this saves 1 to 2 Newton iterations on the fine space, resulting in a reduction of runtime between 4% and 48%, which is in a similar range as reported in [8]. However, the effect diminishes with higher flux values for all polynomial degrees.

Fig. 13. Runtime comparison of standard 1-step iterative solution approach and 2-step approach.

D. Anisotropic Polynomial Degree 

Finally, we utilize the strategy as explained in Section 

III-A by reducing the polynomial degree anisotropically 

in thickness direction pζ


Parameter Identification of a Finite Element 

Based Model of Wound Rotor Induction Machines 

*Martin Mohr, *Oszkár Bíró, *Andrej Stermecki and † Franz Diwoky 

* Christian Doppler Laboratory for Multiphysical Simulation, Analysis and Design of Electrical Machines at the 

Institute for Fundamentals and Theory in Electrical Engineering, Inffeldgasse 18, A-8010 Graz, Austria 

† AVL List GmbH, Hans-List-Platz 1, A-8020 Graz, Austria 

E-mail: martin.mohr@TUGraz.at 

Abstract—This paper presents two efficient algorithms for the parameter identification in a finite element based circuit model 

of a wound rotor induction machine. This approach uses magneto-static finite element method simulations for building lookup 

tables and employs quint-cubic splines for the interpolation. A reduction of the finite element simulation cost is achieved by decreasing the number of nonlinear iterations using a special simulation order that keeps the magneto-motive force in the machine constant for several sampling points. Furthermore, the quint-cubic spline parameter calculation method has been revised using a dimensional recursive evaluation approach, allowing a fast parameter calculation with lower memory demand and offering the possibility of parallelization.

Index Terms— Finite element methods, motor drives, numerical models. 


Finite element (FE) based models, in particular 

physical phase variable (PPV) models, use look-up tables 

(LUT) generated by magneto-static finite element method 

(FEM) simulations. During the following transient 

simulations, only the evaluation of these LUTs by an 

appropriate interpolation method is needed. Several 

applications of this approach to electrical machines have 

already been published [1]-[8]. 

For permanent magnet machines, this approach is 

straightforward [1]-[5]. Typically three independent state 

variables are sufficient for prescribing the state of the 

machine. Therefore, no more than a few hundred or 

thousand FEM simulations are needed and a three 

variable interpolation method is sufficient. 

In contrast, wound rotor induction machines (WRIM) 

necessitate a higher number of state variables. This is a 

consequence of the second coil system on the rotor. The 

number of simulations $N_{tot}$ increases exponentially with
the number of state variables:

$$N_{tot} = \prod_{i \in \text{states}} N_i \qquad (1)$$

where $N_i$ is the number of sampling points for the $i$-th state

variable. Furthermore, a higher dimensional interpolation 

method is necessary. Nevertheless, several 

implementations of FE-based WRIM models are found in 

the literature [6]-[8]. 

This work refers to the PPV-model of a WRIM with 

stranded coils introduced in [8] and can be perceived as 

an addendum to it. Since the parameter identification has 

not been treated in [8], this paper presents the practical 

application of this model. 

The drawbacks of the PPV-WRIM model are the very 

high number of needed FEM-simulations and the 

demanding quint-cubic spline parameter calculation. Both 

are caused by the number of state variables of the model 

being five as shown in the next section. 

II. INTRODUCTION TO THE PPV-WRIM MODEL 

This model uses the electrical rotor position $\varphi_{Rot}$ and two transformed currents for each of the stator, $i_\alpha^S, i_\beta^S$, and the rotor, $i_\alpha^R, i_\beta^R$, as state variables. This reduction can be done under the assumption that the rotor and stator systems are in Y-connection with isolated star point.

The current state variables used are the phase currents of the stator $i_A^S, i_B^S, i_C^S$ and rotor $i_A^R, i_B^R, i_C^R$ transformed into the rotor related reference system:

$$i_A^S + i_B^S + i_C^S = 0, \qquad i_A^R + i_B^R + i_C^R = 0,$$

S 1 0 S 

i 

 

 

cosRot sin 

Rot 2 i 

1 2 A 

 

 

S , S 

i 

 

sinRot cosRot 3 i 

(2) 

 

 

 

 

B 

3 3 

R 1 0 R 

i 

 

2 i 

1 2 A 

 

 

R R . 

i 

 

3 i 

 

 

 

B 

3 3 

The rotor position is derived from the mechanical 

model and is an input variable for the electrical machine 

model. 

The LUTs used in this model approach contain the phase flux linkages of the stator $\psi_A^S, \psi_B^S, \psi_C^S$ and of the rotor $\psi_A^R, \psi_B^R, \psi_C^R$ for the electrical system of equations as well as the machine torque $T$.

All these quantities are parameterized by the five state variables:

$$\mathrm{LUT} = f\left(\varphi_{Rot},\, i_\alpha^S,\, i_\beta^S,\, i_\alpha^R,\, i_\beta^R\right). \qquad (3)$$

In contrast to [6], the total machine torque T is also 

included in the LUT. 

The electrical system of equations uses the line to line 

quantities defined as 

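$$
\begin{aligned}
v_{AB}^S &:= v_B^S - v_A^S, & v_{BC}^S &:= v_C^S - v_B^S, & i_{AB}^S &:= i_B^S - i_A^S, & i_{BC}^S &:= i_C^S - i_B^S, \\
v_{AB}^R &:= v_B^R - v_A^R, & v_{BC}^R &:= v_C^R - v_B^R, & i_{AB}^R &:= i_B^R - i_A^R, & i_{BC}^R &:= i_C^R - i_B^R,
\end{aligned}
\qquad (4)
$$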

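with the phase voltages of the stator $v_A^S, v_B^S, v_C^S$ and of the rotor $v_A^R, v_B^R, v_C^R$. This system can be rewritten as

$$
\begin{pmatrix}
v_{AB}^S - R_{CU}^S\, i_{AB}^S \\
v_{BC}^S - R_{CU}^S\, i_{BC}^S \\
v_{AB}^R - R_{CU}^R\, i_{AB}^R \\
v_{BC}^R - R_{CU}^R\, i_{BC}^R
\end{pmatrix}
=
\begin{pmatrix}
\frac{\partial \psi_{AB}^S}{\partial \varphi_{Rot}} & \frac{\partial \psi_{AB}^S}{\partial i_\alpha^S} & \frac{\partial \psi_{AB}^S}{\partial i_\beta^S} & \frac{\partial \psi_{AB}^S}{\partial i_\alpha^R} & \frac{\partial \psi_{AB}^S}{\partial i_\beta^R} \\
\frac{\partial \psi_{BC}^S}{\partial \varphi_{Rot}} & \frac{\partial \psi_{BC}^S}{\partial i_\alpha^S} & \frac{\partial \psi_{BC}^S}{\partial i_\beta^S} & \frac{\partial \psi_{BC}^S}{\partial i_\alpha^R} & \frac{\partial \psi_{BC}^S}{\partial i_\beta^R} \\
\frac{\partial \psi_{AB}^R}{\partial \varphi_{Rot}} & \frac{\partial \psi_{AB}^R}{\partial i_\alpha^S} & \frac{\partial \psi_{AB}^R}{\partial i_\beta^S} & \frac{\partial \psi_{AB}^R}{\partial i_\alpha^R} & \frac{\partial \psi_{AB}^R}{\partial i_\beta^R} \\
\frac{\partial \psi_{BC}^R}{\partial \varphi_{Rot}} & \frac{\partial \psi_{BC}^R}{\partial i_\alpha^S} & \frac{\partial \psi_{BC}^R}{\partial i_\beta^S} & \frac{\partial \psi_{BC}^R}{\partial i_\alpha^R} & \frac{\partial \psi_{BC}^R}{\partial i_\beta^R}
\end{pmatrix}
\begin{pmatrix}
\frac{d\varphi_{Rot}}{dt} \\ \frac{d i_\alpha^S}{dt} \\ \frac{d i_\beta^S}{dt} \\ \frac{d i_\alpha^R}{dt} \\ \frac{d i_\beta^R}{dt}
\end{pmatrix}
\qquad (5)
$$

with the stator phase resistance $R_{CU}^S$ and the rotor phase resistance $R_{CU}^R$. The line-to-line flux linkages $\psi_{AB}$ and $\psi_{BC}$ are formed analogously to (4).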

For the evaluation of the function values as well as the 

partial derivatives, a quint-cubic spline interpolation has 

been suggested in [8]. Furthermore, it has been shown 

that this model is equivalent to a FEM model in all but 

interpolation errors. However, the parameter 

identification process has not been discussed there. 

In this work, an improved simulation workflow for the 

needed FEM simulations is proposed and presented. 

Furthermore, a revised quint-cubic spline parameter 

calculation method is introduced in detail. 

III. IMPROVED FEM SIMULATION WORKFLOW 

The motivation for an improved simulation workflow is 

the circumstance that the number of magneto-static FEM 

simulations needed to achieve an accurate interpolation is 

higher than in usual applications. However, there are only 

slight changes in the input data of these simulations. 

Utilizing this circumstance, a reduction of the FEM 

simulation time can be achieved as shown below. 

A. Finite Element WRIM Model for Parameter 

Identification 

The FEM model of the WRIM under investigation is 

shown in Fig. 1. This three phase machine has three 

magnetic pole pairs. The rotor and stator coil systems are 

in Y-connection with isolated star points. Furthermore, all 

coils are modeled as stranded coils, skin effect is not 

considered. Due to the symmetry of the machine, only a 

third of the geometry is modeled, decreasing the number 

of finite elements and thus the simulation time. Periodic 

boundary conditions are used at the cutting planes. Rotor 

and stator are coupled with constraint equations, thus no 

conforming mesh is needed, and the rotation of the rotor 

is taken into account [9]. 

The number of finite elements is 16 128 and the number of DOFs is 48 025. A single magneto-static FEM simulation with ANSYS 12.1 needs approximately 21 seconds on a computer with an Intel Core2 Quad CPU (Q9400) and 8 GB RAM. During this simulation, 20 nonlinear iterations are needed on average. Simulation time and number of iterations depend on the operating point.

For a test example of nine sampling points for each 

state current and 15 different rotor positions, 98 415 


FEM simulations have to be carried out. The overall 

calculation time is approximately 574 hours. In order to 

reduce the computation time, the approach described 

below and called method of constant magneto-motive 

force (MMF) is proposed. 
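The figures of the test example can be checked directly from (1). The following minimal sketch assumes the sampling described above (nine points for each of the four transformed currents, 15 rotor positions) and the quoted average of 21 seconds per simulation; the variable names are illustrative only.

```python
from math import prod

# Sampling points per state variable (from the test example in the text):
# 15 rotor positions and nine points for each of the four transformed currents.
samples = {"rotor_position": 15, "i_alpha_S": 9, "i_beta_S": 9,
           "i_alpha_R": 9, "i_beta_R": 9}

# Equation (1): the total number of simulations is the product of all N_i.
n_tot = prod(samples.values())

# Average time of one magneto-static simulation quoted in the text.
seconds_per_simulation = 21
total_hours = n_tot * seconds_per_simulation / 3600

print(n_tot)               # 98415
print(round(total_hours))  # 574
```

Both values agree with the 98 415 simulations and approximately 574 hours stated for the test example.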

Figure 1: FEM model of wound rotor induction machine with rotor and 

stator coil system. 

B. Method of Constant Magneto-Motive Force 

A reduction of the FEM simulation time can be 

achieved in general by reducing 

o the number of finite elements, 

o the number of simulations or 

o the number of nonlinear iterations. 

A reduction of the number of elements decreases the 

model quality and a reduction of the total number of 

simulations decreases the quality of the interpolation. 

However, a reduction of the number of nonlinear 

iterations reduces the simulation time without decreasing 

the model quality. Nevertheless, a direct manipulation of 

this quantity is not possible because it depends on the 

nonlinear behavior of the material used in the model. 

However, the material properties of a prior simulation can 

be used as initial guess for the new simulation. 

This can be done for several magneto-static simulations, although they are independent of each other, under the assumption that the saturation state between these simulations is approximately the same. This can be accomplished if the magneto-motive force $\Theta$ between these simulations does not change significantly.
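The benefit of reusing a previous solution as initial guess can be illustrated with a toy problem. The sketch below is not the FEM model of the paper: it applies Newton's method to an arbitrary scalar equation x + x^3 = c, where c loosely plays the role of the excitation, and compares a cold start with a start from the solution of a neighbouring excitation. The function, the values of c and the tolerance are assumptions chosen purely for illustration.

```python
def newton_iterations(c, x0, tol=1e-10, max_iter=100):
    """Solve x + x**3 = c with Newton's method; return (root, iterations)."""
    x = x0
    for iteration in range(max_iter):
        residual = x + x**3 - c
        if abs(residual) < tol:
            return x, iteration
        x -= residual / (1.0 + 3.0 * x**2)  # Newton update
    raise RuntimeError("Newton iteration did not converge")

# Solution for a neighbouring excitation, used as warm start below.
x_prev, _ = newton_iterations(9.5, 0.0)

x_cold, cold_iters = newton_iterations(10.0, 0.0)     # start from zero
x_warm, warm_iters = newton_iterations(10.0, x_prev)  # start from neighbour

print(cold_iters, warm_iters)
```

Both runs reach the same root (x = 2), but the warm start needs fewer iterations because its initial guess is already close to the solution.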

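If the same reference system is used for the rotor and stator current transformation, then $\Theta$ can be easily calculated component-by-component using

$$
\begin{pmatrix} \Theta_\alpha \\ \Theta_\beta \end{pmatrix}
= N_S \begin{pmatrix} i_\alpha^S \\ i_\beta^S \end{pmatrix}
+ N_R \begin{pmatrix} i_\alpha^R \\ i_\beta^R \end{pmatrix},
\qquad (6)
$$

with the number of windings on the stator side $N_S$ and on the rotor side $N_R$. So, a constant MMF can be reached by keeping the MMF components $\Theta_\alpha$ and $\Theta_\beta$ constant, as illustrated in Fig. 2.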

Figure 2: Several parameter setups (different stator and rotor currents) 

with constant total magneto-motive force. 

This leads to a simple algorithm using a special 

simulation order for the different parameter setups. 

Figure 3: a) Block diagram of the algorithm with five independent loops, one for each state variable. b) Block diagram of the constant MMF algorithm.

Figure 3a shows the straightforward approach without any optimization. All loops are independent, the loop order is arbitrary and massive parallelization is possible.

Figure 3b shows the improved constant MMF algorithm. Instead of five independent loops, one for each state variable, this algorithm consists of two coupled loops for each of the orthogonal α- and β-components and one independent loop for the rotor position. The outermost loops define the MMF vector component-by-component, the innermost loops change the ratio of rotor and stator current for the relevant components. By this coupling, the step sizes of the state variables are also coupled and cannot be chosen freely. Nevertheless, a high number of simulations with constant MMF can be carried out.
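A minimal sketch of the coupled loops for one current component is given below. It assumes that the stator and rotor currents are sampled symmetrically around zero with 2n+1 points each and that the step sizes are coupled such that N_S Δi^S = N_R Δi^R; the function name, the winding numbers and the step size are hypothetical. The outer loop fixes the MMF level, the inner loop shifts current between stator and rotor at that level.

```python
def constant_mmf_order(n, turns_s, turns_r, step_s):
    """Yield (mmf, i_s, i_r) for one current component, grouped by MMF level.

    The grid has 2*n + 1 sampling points per current. The rotor step is
    coupled to the stator step so that every index shift keeps the MMF.
    """
    step_r = turns_s * step_s / turns_r      # coupled step size
    for level in range(-2 * n, 2 * n + 1):   # outer loop: MMF level
        # inner loop: all stator indices whose rotor partner is on the grid
        for a in range(max(-n, level - n), min(n, level + n) + 1):
            b = level - a                    # rotor index, a + b = level
            i_s, i_r = a * step_s, b * step_r
            yield turns_s * i_s + turns_r * i_r, i_s, i_r

# Hypothetical values: 9 sampling points per current (n = 4).
points = list(constant_mmf_order(n=4, turns_s=10, turns_r=20, step_s=1.0))
print(len(points))  # 81 = 9 * 9, every grid point is visited once
```

Consecutive simulations in this ordering share the same MMF component, so each one can start from the material state of its predecessor. The same ordering would be applied to both the α- and the β-component.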

The variation of the rotor position is done between 

these loops. This is suggested because the saturation state 

between adjacent rotor positions does not change 

significantly. However, this loop can be easily made the 

outermost one, allowing a parallelization for each rotor 

position.


A direct comparison between the two algorithms in 

Fig. 3 is shown in Fig. 4, highlighting the benefits of this 

simple algorithm. 

Figure 4: Comparison of the number of nonlinear iterations for 100 

magneto-static FEM simulations with and without constant MMF. 

The number of nonlinear iterations has been reduced 

approximately by a factor of four, the simulation time by 

a factor of three. This lower improvement for the 

simulation time can be explained by computational costs 

during the simulation setup and initialization. For the test 

example, the total simulation time has decreased to 190h 

from 574h, without any parallelization. 

IV. QUINT CUBIC SPLINE PARAMETER CALCULATION 

As shown in [8], a quint cubic spline interpolation 

method is well suited to be applied in the PPV-WRIM 

model, because it allows a continuous interpolation of the 

function value as well as the first partial derivatives of the 

function defined by the LUT. Nevertheless, the number of 

parameters per segment is 1024 and this results in a high 

memory demand for the LUTs holding these spline 

parameters. 

For the test example, the memory demand per LUT is approximately 480 MB. Thus, approximately 3.3 GB are needed in total for the PPV-WRIM model. This is not a problem for state-of-the-art computers, and moreover it is not necessary to load all data into the RAM. The calculation of the parameters, however, is very demanding, since their number is as high as 62.9 million per LUT (Table III).

Nevertheless, under the assumption of a regular and 

orthogonal grid and with the continuity conditions for 

quint-cubic splines utilized, a fragmentation of this 

system of equations is possible, allowing a fast 

calculation of the spline parameters with less memory 

demand and a parallelization capability. In this section, 

this method is presented and explained in detail. 

A. Quint-cubic spline interpolation 

A quint-cubic spline interpolation is a piecewise third order polynomial interpolation in five dimensions as shown in (8). $C^1$-continuity (continuity of the function value and its first order partial derivatives) at the segment boundaries can be enforced by the right choice of continuity conditions. Within the segments all derivatives are continuous due to the properties of polynomials.

Each segment in this sense is a five dimensional (5D) hyper-cuboid with 32 corners and 10 adjacent segments, each sharing a four dimensional (4D) hyper-cuboid with it. The number of such segments $N_{Seg}$ depends on the number of sampling points of each coordinate variable $x_1, x_2, x_3, x_4, x_5$ and can be calculated as

$$N_{Seg} = \prod_{i \in \{x_1, x_2, x_3, x_4, x_5\}} (N_i - 1). \qquad (7)$$

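The quint-cubic spline is defined as

$$
\mathbf{x} := (x_1, x_2, x_3, x_4, x_5), \qquad
f(\mathbf{x}) = \sum_{i=0}^{3} \sum_{j=0}^{3} \sum_{k=0}^{3} \sum_{l=0}^{3} \sum_{m=0}^{3} a_{ijklm}\, \bar{x}_1^{\,i}\, \bar{x}_2^{\,j}\, \bar{x}_3^{\,k}\, \bar{x}_4^{\,l}\, \bar{x}_5^{\,m},
\qquad (8)
$$

with the spline parameters $a_{ijklm}$ of the segment fulfilling

$$
x_1^L \le x_1 \le x_1^U, \quad x_2^L \le x_2 \le x_2^U, \quad x_3^L \le x_3 \le x_3^U, \quad
x_4^L \le x_4 \le x_4^U \quad \text{and} \quad x_5^L \le x_5 \le x_5^U,
\qquad (9)
$$

with the lower segment boundaries $x_1^L, x_2^L, x_3^L, x_4^L, x_5^L$, the upper segment boundaries $x_1^U, x_2^U, x_3^U, x_4^U, x_5^U$ and the local coordinates $\bar{x}_1, \bar{x}_2, \bar{x}_3, \bar{x}_4, \bar{x}_5$ defined by

$$
\bar{x}_1 = x_1 - x_1^L, \quad \bar{x}_2 = x_2 - x_2^L, \quad \bar{x}_3 = x_3 - x_3^L, \quad
\bar{x}_4 = x_4 - x_4^L \quad \text{and} \quad \bar{x}_5 = x_5 - x_5^L.
\qquad (10)
$$

The first order partial derivative with respect to $x_1$ can be interpolated with

$$
f_{x1} = \frac{\partial f}{\partial x_1} = \sum_{i=1}^{3} \sum_{j=0}^{3} \sum_{k=0}^{3} \sum_{l=0}^{3} \sum_{m=0}^{3} i\, a_{ijklm}\, \bar{x}_1^{\,i-1}\, \bar{x}_2^{\,j}\, \bar{x}_3^{\,k}\, \bar{x}_4^{\,l}\, \bar{x}_5^{\,m}
\qquad (11)
$$

and in a similar manner those with respect to the other variables.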
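As an illustration of (8) and (11), the following sketch evaluates the spline and its derivative with respect to x1 inside one segment. It assumes that the 1024 parameters of the segment are stored in a nested list a[i][j][k][l][m] and that the point is already given in local coordinates; both the storage layout and the function names are illustrative choices.

```python
from itertools import product

def eval_segment(a, xl):
    """Evaluate f from (8) at local coordinates xl = (x1, ..., x5)."""
    total = 0.0
    for i, j, k, l, m in product(range(4), repeat=5):
        total += (a[i][j][k][l][m] * xl[0]**i * xl[1]**j
                  * xl[2]**k * xl[3]**l * xl[4]**m)
    return total

def eval_segment_dx1(a, xl):
    """Evaluate the partial derivative with respect to x1 from (11)."""
    total = 0.0
    for i, j, k, l, m in product(range(1, 4), range(4), range(4),
                                 range(4), range(4)):
        total += (i * a[i][j][k][l][m] * xl[0]**(i - 1) * xl[1]**j
                  * xl[2]**k * xl[3]**l * xl[4]**m)
    return total

def zero_parameters():
    """Return a 4x4x4x4x4 nested list filled with zeros."""
    return [[[[[0.0] * 4 for _ in range(4)] for _ in range(4)]
             for _ in range(4)] for _ in range(4)]

# Example: f = 2 + 3 * x1**2 * x5, i.e. a_00000 = 2 and a_20001 = 3.
a = zero_parameters()
a[0][0][0][0][0] = 2.0
a[2][0][0][0][1] = 3.0
point = (0.5, 0.1, 0.2, 0.3, 2.0)
print(eval_segment(a, point))      # 2 + 3 * 0.25 * 2 = 3.5
print(eval_segment_dx1(a, point))  # 6 * 0.5 * 2 = 6.0
```

A production implementation would use a Horner scheme or a vectorized contraction instead of the explicit power evaluation, which is kept here for readability.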

Linear extrapolation or a periodic behavior at the 

domain boundaries can be easily realized using 

appropriate additional constraints [11]. 

B. Continuity conditions for quint-cubic splines 

The choice of the continuity conditions for the spline parameter determination is important for the continuity at the segment boundaries. $C^1$-continuity even at the segment boundaries can be achieved for tri-cubic splines if the function value $f$, the first order partial derivatives $f_{x1}, f_{x2}, f_{x3}$, and all higher order pure mixed derivatives $f_{x1x2}, f_{x1x3}, f_{x2x3}$ and $f_{x1x2x3}$ are continuous at the corners of each cuboid segment. This prerequisite is proven in [10] for tri-cubic splines.

However, this proof can also be used for higher dimensional splines. This leads to the conclusion that for $C^1$-continuity, the same prerequisites are necessary. In the case of quint-cubic splines, these are the function value, the five first order partial derivatives and all 26 higher order pure mixed derivatives. These are in sum 32 equations per corner and, with 32 corners, in total 1024 constraints for 1024 parameters per segment. It can easily be proven that these equations are linearly independent.

C. Continuity of f, fx1, fx2, fx3, fx4 and fx5 on the segment 

faces 

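For the interpolation of the partial derivatives, it is necessary to show their continuity also on the segment faces. Therefore, let us assume a regular and orthogonal grid for the sampling variables and two adjacent segments $S_1$ and $S_2$ that share a face with $x_1 = \text{const}$. Without loss of generality, the face defined by $x_1^U$ is used for $S_1$ and the face defined by $x_1^L$ is used for $S_2$. On both faces, the quint-cubic splines become the quad-cubic splines $f^{S1}$ and $f^{S2}$ below:

$$
\begin{aligned}
f^{S1} &= \sum_{j,k,l,m=0}^{3} \underbrace{\left[ \sum_{i=0}^{3} a_{ijklm}^{S1} \left(x_1^U - x_1^L\right)^{i} \right]}_{\text{quad-cubic spline parameter } b_{jklm}^{S1}} \bar{x}_2^{\,j}\, \bar{x}_3^{\,k}\, \bar{x}_4^{\,l}\, \bar{x}_5^{\,m}, \\
f^{S2} &= \sum_{j,k,l,m=0}^{3} a_{0jklm}^{S2}\, \bar{x}_2^{\,j}\, \bar{x}_3^{\,k}\, \bar{x}_4^{\,l}\, \bar{x}_5^{\,m} \quad \text{with: } b_{jklm}^{S2} = a_{0jklm}^{S2}, \\
f_{x1}^{S1} &= \sum_{j,k,l,m=0}^{3} \underbrace{\left[ \sum_{i=1}^{3} i\, a_{ijklm}^{S1} \left(x_1^U - x_1^L\right)^{i-1} \right]}_{\text{quad-cubic spline parameter } b_{jklm}^{S1}} \bar{x}_2^{\,j}\, \bar{x}_3^{\,k}\, \bar{x}_4^{\,l}\, \bar{x}_5^{\,m}, \\
f_{x1}^{S2} &= \sum_{j,k,l,m=0}^{3} a_{1jklm}^{S2}\, \bar{x}_2^{\,j}\, \bar{x}_3^{\,k}\, \bar{x}_4^{\,l}\, \bar{x}_5^{\,m} \quad \text{with: } b_{jklm}^{S2} = a_{1jklm}^{S2}.
\end{aligned}
\qquad (12)
$$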

For continuity of $f$, $f_{x2}$, $f_{x3}$, $f_{x4}$ and $f_{x5}$ at this face, it is sufficient that the quad-cubic splines $f^{S1}$ and $f^{S2}$ are equal. Therefore, all spline parameters $b_{jklm}^{S1}$ of $f^{S1}$ must be equal to the corresponding parameters $b_{jklm}^{S2}$ of $f^{S2}$.

This equality can easily be shown, since both of these quad-cubic splines fulfill the same 16 constraints $f, f_{x2}, f_{x3}, f_{x4}, f_{x5}, f_{x2x3}, \ldots, f_{x2x3x4x5}$ in all 16 involved nodes and these 256 equations for 256 unknowns are linearly independent. Therefore, the solution is unique and the two splines are the same.

For continuity of $f_{x1}$ at this face, the two additional quad-cubic splines $f_{x1}^{S1}$ and $f_{x1}^{S2}$, representing the partial derivative with respect to $x_1$ on this face, must be equal. This can be proven in a similar manner using the other 16 constraints $f_{x1}, f_{x1x2}, f_{x1x3}, f_{x1x4}, f_{x1x5}, \ldots, f_{x1x2x3x4x5}$ per node.

Furthermore, the same proof can be used for the other faces. Thus, $C^1$-continuity for the quint-cubic spline interpolation has been shown for a regular and orthogonal grid.

D. Segmented parameter calculation 

The proof of continuity shows that f, fx1, fx2, fx3, fx4, fx5, 

fx1x2, fx1x3,…, fx2x3x4x5 and fx1x2x3x4x5 must be continuous in 

each node. If all of those 32 conditions per node can be 

determined in advance, it is not necessary to assemble a 

single system of equations for all segments. Instead, each 

segment could be solved independently. This will lead to 

NSeg systems of equations with 1024 unknowns each and 

furthermore allows parallel evaluation. 

Under the assumption of a regular and orthogonal grid for the sampling data, only one coordinate changes along each edge of the segments. Therefore, the quint-cubic spline becomes a normal cubic spline

$$
f(x_1)\big|_{x_2, x_3, x_4, x_5 = \text{const}} = \sum_{i=0}^{3} \hat{a}_i\, \bar{x}_1^{\,i}
\quad \text{with} \quad
\hat{a}_i = \sum_{j=0}^{3} \sum_{k=0}^{3} \sum_{l=0}^{3} \sum_{m=0}^{3} a_{ijklm}\, \bar{x}_2^{\,j}\, \bar{x}_3^{\,k}\, \bar{x}_4^{\,l}\, \bar{x}_5^{\,m},
\qquad (13)
$$

as shown for an edge with constant values for $x_2$, $x_3$, $x_4$ and $x_5$. The resulting cubic splines along the straight lines with constant values for $x_2$, $x_3$, $x_4$ and $x_5$ can be evaluated independently of each other. This can also be done for all other straight lines where only one coordinate changes. This leads to a number of $N_{CSP}$ simple cubic splines, where

$$
N_{CSP} = \sum_{i \in M} \; \prod_{j \in M \setminus \{i\}} N_j
\quad \text{with} \quad M := \{x_1, x_2, x_3, x_4, x_5\}.
\qquad (14)
$$

With these cubic splines, all first order partial derivatives in each node can be evaluated as for $x_1$ below:

$$f_{x1}(x_1) = \sum_{i=1}^{3} i\, \hat{a}_i\, \bar{x}_1^{\,i-1}. \qquad (15)$$

These derivatives can be used in a similar manner for determining all higher order mixed derivatives.

This approach needs a high number of cubic spline determinations, but all these systems of equations have only three unknowns per segment and thus in total $3(N_{x1} - 1)$ unknowns for the cubic splines along the $x_1$-direction, for example. Furthermore, the system matrices for all splines in one direction are equal. This leads to a single system of equations with many right hand sides that can be solved very effectively.
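The reuse of one system matrix for many right-hand sides can be sketched as follows. The example assumes a uniform grid and natural end conditions (vanishing second derivative), which is only one possible choice and is not taken from the paper. It also uses the node derivatives of a C2-continuous cubic spline as unknowns instead of the three polynomial parameters per segment. The tridiagonal elimination factors are computed once and then applied to every data line. All function names are hypothetical.

```python
def factorize(n):
    """Forward-elimination factors of the (n x n) slope system.

    Rows: [2 1 ...], [1 4 1] (interior), [... 1 2]; the sub- and
    superdiagonal entries are all 1. Depends on n only, not on the data.
    """
    diag = [2.0] + [4.0] * (n - 2) + [2.0]
    factors = []
    for r in range(1, n):
        w = 1.0 / diag[r - 1]
        diag[r] -= w
        factors.append(w)
    return diag, factors

def node_slopes(values, h, diag, factors):
    """Node derivatives of the natural cubic spline through `values`."""
    n = len(values)
    rhs = ([3.0 * (values[1] - values[0]) / h]
           + [3.0 * (values[r + 1] - values[r - 1]) / h
              for r in range(1, n - 1)]
           + [3.0 * (values[-1] - values[-2]) / h])
    for r in range(1, n):                      # forward substitution
        rhs[r] -= factors[r - 1] * rhs[r - 1]
    slopes = [0.0] * n
    slopes[-1] = rhs[-1] / diag[-1]
    for r in range(n - 2, -1, -1):             # back substitution
        slopes[r] = (rhs[r] - slopes[r + 1]) / diag[r]
    return slopes

# One factorization, many data lines along the same grid direction.
h = 0.5
diag, factors = factorize(6)
lines = [[2.0 * r * h + 1.0 for r in range(6)],   # f = 2 x + 1
         [-1.0 * r * h for r in range(6)]]        # f = -x
for line in lines:
    print(node_slopes(line, h, diag, factors))
```

For the two linear test lines the computed slopes equal 2 and -1 in every node, up to rounding. The cost per additional line is only one forward and one back substitution.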

E. Dimensional recursive approach 

Obviously, not all 1024 parameters per segment are unknown. For example, the constant parameter $a_{00000}$ of each segment is equal to the function value in the segment reference node $(x_1^L, x_2^L, x_3^L, x_4^L, x_5^L)$, where all local coordinates are zero,

$$a_{00000} = f\left(x_1^L, x_2^L, x_3^L, x_4^L, x_5^L\right) \qquad (16)$$

and all derivatives in the reference node can be directly identified as parameters of the quint-cubic spline

$$
\begin{aligned}
a_{10000} &= f_{x1}\left(x_1^L, x_2^L, x_3^L, x_4^L, x_5^L\right), \\
a_{01000} &= f_{x2}\left(x_1^L, x_2^L, x_3^L, x_4^L, x_5^L\right), \\
&\;\;\vdots
\end{aligned}
\qquad (17)
$$

Nevertheless, most of the parameters per segment must still be calculated. However, the construction of the quint-cubic spline by using a reference node can be easily combined with the continuity conditions. This leads to a dimensional recursive approach allowing a well structured determination of all parameters.

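There are five faces of the segment including the reference node. For each of these faces, one local coordinate is constantly zero, resulting in a quad-cubic spline, as shown for $\bar{x}_1 = 0$:

$$
f(0, \bar{x}_2, \bar{x}_3, \bar{x}_4, \bar{x}_5) = \sum_{j=0}^{3} \sum_{k=0}^{3} \sum_{l=0}^{3} \sum_{m=0}^{3} a_{0jklm}\, \bar{x}_2^{\,j}\, \bar{x}_3^{\,k}\, \bar{x}_4^{\,l}\, \bar{x}_5^{\,m}.
\qquad (18)
$$

All parameters of this quad-cubic spline are also parameters of the quint-cubic spline, as already indicated by the parameter indices in (18). Furthermore, the first order partial derivative with respect to $x_1$ for the face $\bar{x}_1 = 0$ leads to a quad-cubic spline

$$
f_{x1}(0, \bar{x}_2, \bar{x}_3, \bar{x}_4, \bar{x}_5) = \sum_{j=0}^{3} \sum_{k=0}^{3} \sum_{l=0}^{3} \sum_{m=0}^{3} a_{1jklm}\, \bar{x}_2^{\,j}\, \bar{x}_3^{\,k}\, \bar{x}_4^{\,l}\, \bar{x}_5^{\,m}.
\qquad (19)
$$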


All of these parameters can also be identified as quint-cubic spline parameters. Doing this also for the other four coordinates leads to ten quad-cubic splines. Each of them is defined by the continuity conditions of the corresponding nodes. Finally, 992 quint-cubic parameters per segment can be determined by this method; only the 32 parameters $a_{ijklm}$ with $i, j, k, l, m \in \{2, 3\}$ have to be solved for separately. This is due to the segment diagonal node $(x_1^U, x_2^U, x_3^U, x_4^U, x_5^U)$, since this node is not part of any of the determined quad-cubic splines.

For a quad-cubic spline, this approach can be used in the same manner, leading to eight tri-cubic splines and an additional system of equations of 16 unknowns for the diagonal node per segment. The further dimensional recursion is shown in Fig. 5.

Figure 5: Dimensional recursive approach for quint-cubic spline 

parameter determination. 

It is pointed out that the function calls shown in Fig. 5 are multi-function calls. This means that each call has to be multiplied by the number of sampling points for the corresponding dimension to get the real number of single function calls. As an example, the number of single quad-cubic spline function calls $N_{Q4}$ can be calculated as

$$N_{Q4} = 2\left(N_{x1} + N_{x2} + N_{x3} + N_{x4} + N_{x5}\right), \qquad (20)$$

i.e. two multi-function calls per dimension.

Furthermore, there is also some redundancy within this recursion. It is not necessary to make 80 multi-function calls for tri-cubic splines as Fig. 5 suggests. Some of them are equal for different quad-cubic splines. This can easily be shown by the fact that each quad-cubic spline is defined by 256 parameters and each of them is also a parameter of the quint-cubic spline. However, the ten quad-cubic splines define only 992 quint-cubic parameters instead of 10 · 256 = 2560.

TABLE I 

RECURSIVE MULTI-FUNCTION CALLS 

Quint-cubic   Quad-cubic   Tri-cubic   Bi-cubic   Cubic

1             10           40          80         80

Table I shows the actual number of multi-function calls during this recursive approach. Table II shows an overview for different sampling rate setups and Table III shows the corresponding calculation time and memory demand for an implementation in Fortran 95. These calculations were carried out without parallelization on a computer with an Intel Core2 Duo CPU (E8400) and 8 GB RAM.
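The entries of Table I and the split into 992 and 32 parameters can be reproduced by counting. The sketch rests on an interpretation that is not spelled out in the text: a lower-dimensional spline is obtained by fixing some coordinates to zero and choosing, for each fixed coordinate, either the function value or the derivative with respect to it. Under this assumption the number of distinct d-dimensional splines is C(5, 5-d) · 2^(5-d).

```python
from itertools import product
from math import comb

DIMS = 5

# Distinct multi-function calls per spline dimension d = 5, 4, 3, 2, 1.
calls = {d: comb(DIMS, DIMS - d) * 2 ** (DIMS - d) for d in range(DIMS, 0, -1)}
print(calls)  # {5: 1, 4: 10, 3: 40, 2: 80, 1: 80}

# Parameters a_ijklm covered by the ten quad-cubic splines: at least one
# index is 0 (function value on a face) or 1 (derivative on a face).
indices = list(product(range(4), repeat=DIMS))
covered = [idx for idx in indices if any(v in (0, 1) for v in idx)]
remaining = [idx for idx in indices if all(v in (2, 3) for v in idx)]
print(len(indices), len(covered), len(remaining))  # 1024 992 32
```

The naive recursion of Fig. 5 would give 10 · 8 = 80 tri-cubic calls; counting distinct splines gives the 40 listed in Table I.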

TABLE II 

SETUP OVERVIEW FOR VARIOUS SAMPLING RATES 

Setup   Nx1   Nx2   Nx3   Nx4   Nx5   NSample     NSeg

1         6     6     6     6     6      7776     3125
2         8     8     8     8     8     32768    16807
3        11    11    11    11    11    161051   100000
4        12    12    12    12    12    248832   161051
5        16     8     8     8     8     65536    36015
6        16     9     9     9     9    104976    61440

TABLE III 

SIMULATION TIME AND MEMORY DEMAND 

Setup   Number of parameters   Calculation time   Memory demand
        [Mio]                  [s]                [MByte]

1         3.20                   3.04                60.8
2        17.21                  15.61               131.3
3       102.40                  83.80               781.3
4       164.92                 134.53              1258
5        36.88                  33.76               281.4
6        62.91                  54.39               480.0

These setups illustrate the scaling of this approach. A 

comparison shows that the calculation time increases 

slightly slower than the number of segments does. 
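The sample, segment and parameter counts of Tables II and III follow from (7) and from the 1024 parameters per segment. A short check is sketched below; the memory estimate assumes 8 bytes per parameter (double precision) and 1 MByte = 2^20 bytes, an assumption that reproduces the 480 MByte of the last setup but is not stated in the text.

```python
from math import prod

PARAMS_PER_SEGMENT = 4 ** 5  # 1024 coefficients a_ijklm per segment

def spline_counts(n_points):
    """Return (samples, segments, parameters) for a sampling setup."""
    n_sample = prod(n_points)
    n_seg = prod(n - 1 for n in n_points)  # equation (7)
    return n_sample, n_seg, PARAMS_PER_SEGMENT * n_seg

# Setup 6 of Table II: 16 points for x1 and 9 points for x2..x5.
n_sample, n_seg, n_params = spline_counts([16, 9, 9, 9, 9])
print(n_sample, n_seg, n_params)   # 104976 61440 62914560

# Assumed storage: 8 bytes per parameter, 1 MByte = 2**20 bytes.
print(n_params * 8 / 2 ** 20)      # 480.0
```

The same function applied to setup 4 gives 161 051 segments and about 164.9 million parameters.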

The last setup corresponds to the test example of 

section III-A. For this example, Nx1=16, although there 

are 15 different rotor positions. This can be explained by 

the periodicity of the rotor position, i.e. the first sampling 

point is identical with the last one. 

V. CONCLUSION 

The method of constant magneto-motive force presented in this paper reduces the number of nonlinear iterations and thus the overall simulation time for the FEM simulations. This is achieved by merely changing the order of the parameter setups, which makes the method easy to use with any commercial FEM simulation tool.

The revised quint-cubic spline parameter calculation method introduced in this work allows a very fast and memory-saving pre-calculation of the needed spline parameters. This was achieved by a dimensional recursive approach that leads to a segmentation of the whole system of equations. The parameters can also be evaluated in parallel.

These two improvements make the PPV-WRIM model 

approach introduced in [8] applicable to simulation tasks 

during the design stage of electrical drive chains. 

VI. ACKNOWLEDGEMENT 

This work has been supported by the Christian Doppler 

Research Association (CDG) and by the industrial partner 

AVL List GmbH. 

REFERENCES

[1] Mohammed, O.A.; Liu, S.; Liu, Z., "Physical modeling of electric machines for motor drive system simulation," Power Systems Conference and Exposition, 2004. IEEE PES, pp. 781-786 vol. 2, 10-13 Oct. 2004

[2] Liu, Z.; Mohammed, O.A.; Liu, S., "An improved physics-based phase variable model of PM synchronous machines obtained through field computation," Computation in Electromagnetics, 2008. CEM 2008. 2008 IET 7th International Conference on, pp. 166-167, 7-10 April 2008

[3] Kallio, S.; Karttunen, J.; Andriollo, M.; Peltoniemi, P.; Silventoinen, P., "Finite element based phase-variable model in the analysis of double-star permanent magnet synchronous machines," Power Electronics, Electrical Drives, Automation and Motion (SPEEDAM), 2012 International Symposium on, pp. 1462-1467, 20-22 June 2012, doi: 10.1109/SPEEDAM.2012.6264434

[4] Mohammed, O.A.; Liu, S.; Liu, Z.; Khan, A.A., "Improved physics-based permanent magnet synchronous machine model obtained from field computation," Electric Machines and Drives Conference, 2009. IEMDC '09. IEEE International, pp. 1088-1093, 3-6 May 2009, doi: 10.1109/IEMDC.2009.5075339

[5] Mohr, M.; Bíró, O.; Stermecki, A.; Diwoky, F., "An Improved Physical Phase Variable Model for Permanent Magnet Machines," submitted and accepted at ICEM, Marseille, France, 2012

[6] Sarikhani, A.; Mohammed, O.A., "Development of transient FE-physics-based model of induction for real time integrated drive simulations," Electric Machines & Drives Conference (IEMDC), 2011 IEEE International, pp. 687-692, 15-18 May 2011

[7] Sarikhani, A.; Mohammed, O.A., "Non-linear FE-based modeling of induction machine for integrated drives," Computation in Electromagnetics (CEM 2011), IET 8th International Conference on, pp. 1-2, 11-14 April 2011

[8] Mohr, M.; Bíró, O.; Stermecki, A.; Diwoky, F., "An Improved Physical Phase Variable Model for Wound Rotor Induction Machines," submitted and accepted at CEFC, Oita, Japan, 2012

[9] ANSYS® Academic Research, Release 12.1, Help System, "Low-Frequency Electromagnetic Analysis Guide," ANSYS, Inc.

[10] Lekien, F.; Marsden, J., "Tricubic interpolation in three dimensions," Internat. J. Numer. Methods Engrg., vol. 63, no. 3 (2005), pp. 455-471

[11] de Boor, C., "A practical guide to splines," New York: Springer-Verlag, 1978, p. 39f., ISBN: 9780387903569


Post Insulator Optimization Based on Dynamic 

Population Size 

Peter Kitak, Arnel Glotic, Igor Ticar 

University of Maribor, Faculty of Electrical Engineering and Computer Science, Smetanova 17, SI-2000 Maribor, 

Slovenia 

E-mail: peter.kitak@uni-mb.si 

Abstract—This paper suggests the use of dynamic population size throughout the optimization process which is applied on the 

numerical model of a medium voltage post insulator. The main objective of the dynamic population is reducing population 

size, to achieve faster convergence. Change of population size can be done in any iteration by proposed method. The multiobjective 

optimization process is based on the PSO algorithm, which is suitably modified in order to operate with the principle 

of the optimal Pareto front. 

Index Terms—Dynamic population size, Insulation elements, Multi-objective optimization, particle swarm optimization. 

The principal function of a medium voltage insulator is the electrical insulation of the conductive parts from the earthed parts of the device, and the mechanical fixing of equipment and conductors which have different electrical potential. This element often contains a built-in capacitive voltage divider and is thus capable of performing a voltage indication function.

The particle swarm optimization (PSO) method [1] is a very efficient algorithm and has been applied to many engineering problems. In comparison to the original version, many modifications have been made to the algorithm, which has gained many improvements [2], [3]. The PSO algorithm has also been extended into a multi-objective particle swarm optimization (MOPSO) [4], [5], on the basis of non-dominated solution sorting (Pareto concept) [6]. Although the PSO algorithm does not contain genetic operators in its fundamentals, it has been proven that the introduction of mutation is very useful [7]. The presented method enables the enhancement of the solution space. More recent modifications of the algorithm introduce a dynamic population size throughout the optimization process [8]. A variable [9], [10] or fixed [11] change of the population size during the evolution is possible. The method with variable changes of population size enables a change in any iteration. In the method with fixed change, the population is changed at a predefined iteration step.

Reduction of the population is desired when the duration of the optimization needs to be shortened. However, the efficiency and robustness of the modified algorithm must not change.

The remainder of this paper is organized as follows. Section two describes the numerical model of a medium voltage post insulator and also a multi-objective optimization model. Section three presents the classical Particle Swarm Optimization (PSO) algorithm and also an updated version of the algorithm with the Cauchy mutation operator. Section four describes a dynamic population PSO algorithm, where two procedures for population reduction are presented. Results of the classical PSO algorithm and of all dynamic reductions in the population size are presented in the fifth section. The sixth section is a conclusion with fundamental summarizing thoughts.


II. NUMERICAL MODEL AND OPTIMUM SIGNIFICATION

A post insulator is used as a voltage indicator in medium-voltage switchgear. Post insulators are installed in a switchgear device or in any other input element where voltage is present. Fig. 1 shows the parametrically described 3D numerical model of a post insulator. The metal fitting of the insulator for fastening to the conductive part on the upper side (upper connection) is extended by a special divider electrode, which has the same potential as the conductive part. The metal fitting for fastening the insulator to the earthed part (lower connection) is situated at the bottom of the insulator. An electrically separated cylindrical metal mesh, which forms the other electrode of the divider, is mounted around this metal connection.

The design of this switchgear element demands an exact value of capacitance, which is calculated from the electric energy. It also requires the lowest possible magnitude of the electric field inside the insulation material, because increased values of the electric field shorten the lifetime of these switchgear elements. Optimization reduces the electric field strength on the electrode edges and at other critical points on the interface between insulation and air. Stable switchgear performance requires that the switchgear elements be constructed to endure the highest voltages, such as a lightning strike (125 kV).

The requirements described above, which have opposing tendencies, are represented by two objective functions (fC, fE). Each objective function represents an individual electric characteristic with a different physical quantity, therefore the objective functions must be written in relative form. Both objective functions are written as bell-shaped fuzzy sets with a maximal value of 1. Function fE is presented in Fig. 2a and function fC in Fig. 2b. The numerical calculation is based on FEM analysis with the solver of the EleFAnT program package [12].

Figure 1: Post insulator illustrative model with optimization parameters (labeled parts: upper electrode, epoxy insulation, cylindrical metal mesh, lower electrode).

The entire model is parametrically written and 

described with eight parameters (Fig. 1). It is necessary to 

perform FEM calculation for each evaluation of the 

objective functions. 

Figure 2: Determination of objective function: a) fE as a function of E (MV/m), b) fC as a function of C (μF).

The PSO algorithm [1] has been used as the multi-objective optimization algorithm. Among other possible methods, the weighted sum method is used to solve the multiobjective optimization problem. Equation (1) describes how the functions fE and fC are merged into a unified objective function f:

$f = w_E f_E + w_C f_C$ (1)

where wE and wC are the weights of the individual quantities.

The weighted sum method requires special attention when selecting the weights of the objective functions, because the weights transform the problem into an optimization with a single composed objective function. The method also requires detailed knowledge of the application problem. Based on their knowledge of the presented problem, the authors have selected the following weights in (1): wE = 0.6 and wC = 0.4.
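As a minimal illustration of the weighted sum in (1), the sketch below combines two already evaluated membership values. The function name `weighted_objective` and the sample values are assumptions made for this example only; the bell-shaped membership functions themselves are not reproduced here.

```python
def weighted_objective(f_e, f_c, w_e=0.6, w_c=0.4):
    """Merge the two relative objective values into one value, as in (1).

    f_e, f_c: values of the fuzzy-set objective functions fE and fC.
    w_e, w_c: weights; the defaults are the values selected in the text.
    """
    return w_e * f_e + w_c * f_c


# Hypothetical membership values, chosen only to show the arithmetic.
example = weighted_objective(0.5, 0.25)
```

With the hypothetical inputs above, the field-strength term contributes 0.3 and the capacitance term 0.1 to the unified objective.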


III. PSO ALGORITHM AND CAUCHY MUTATION 

The PSO algorithm is a population-based stochastic search technique that imitates the social behavior of birds in flight and does not contain genetic operators. Instead of being subjected to genetic operators, the population members cooperate and, at the same time, compete with each other throughout the generations.

Each particle adjusts its flight according to the leading particle, the best individual. Each particle of the population, which represents a possible solution to the problem, is treated as a point in the D-dimensional space.

The i-th particle is represented as $x_i = (x_{i1}, x_{i2}, \ldots, x_{iD})$. The best former position of each particle (the position that gave the best result in the previous iterations) is stored and represented as $p_i = (p_{i1}, p_{i2}, \ldots, p_{iD})$. The velocity of the i-th particle is represented as $v_i = (v_{i1}, v_{i2}, \ldots, v_{iD})$. The velocity $v_i$ and the position $x_i$ of the i-th particle change in accordance with (2) and (3):

$v_i(t+1) = w\,v_i(t) + c_1\,\mathrm{rand}()\,(p_i(t) - x_i(t)) + c_2\,\mathrm{Rand}()\,(p_g(t) - x_i(t))$ (2)

$x_i(t+1) = x_i(t) + v_i(t+1)$ (3)

where t indicates the iteration, w is the inertia weight, c1 and c2 are positive constants, and rand() and Rand() are random functions with values in the interval [0,1]. Index g denotes the best particle among all particles in the optimization process. Equation (2) calculates the new particle velocity from the previous particle velocity and from the distances between the particle's current position and, respectively, its own best position and the position of the leading particle. Equation (3) represents the flight of the particle towards a new position.
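The update rules (2) and (3) can be sketched for a single particle as follows. This is an illustrative sketch, not the authors' implementation: the function name `pso_step` is hypothetical, the random numbers are assumed to be drawn independently for each dimension, and the default coefficients are the settings reported in Section V.

```python
import random


def pso_step(x, v, p_best, g_best, w=0.8, c1=0.5, c2=1.2, rng=random):
    """Return the new position and velocity of one particle.

    x, v    : current position and velocity (lists of length D)
    p_best  : best former position of this particle
    g_best  : position of the leading (best) particle
    """
    new_x, new_v = [], []
    for d in range(len(x)):
        r1 = rng.random()  # plays the role of rand() in (2)
        r2 = rng.random()  # plays the role of Rand() in (2)
        vd = (w * v[d]
              + c1 * r1 * (p_best[d] - x[d])
              + c2 * r2 * (g_best[d] - x[d]))   # equation (2)
        new_v.append(vd)
        new_x.append(x[d] + vd)                 # equation (3)
    return new_x, new_v
```

If a particle already sits at both its own best position and the leading position, only the inertia term w v_i(t) remains, which is a quick way to check an implementation.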

Once the new population is entirely formed, the algorithm is repeated until the stopping criterion is reached. Two approaches are used in this paper: a classical approach with a static population (the number of population members is always the same) and a dynamic approach, in which the number of population members changes throughout the optimization process. The quality of each particle is evaluated on the basis of the defined objective function.

To prevent premature convergence and, consequently, trapping in a local minimum, the classical PSO algorithm has been upgraded with the Cauchy mutation [13]. The Cauchy mutation operator used in the PSO is determined by a weight vector:

$W_i = \frac{1}{NP} \sum_{j=1}^{NP} v_{ji}$, (4)

where $v_{ji}$ is the i-th velocity component of the j-th particle in the population. The best particle from (2) is mutated according to the following equation:

$p'_g(i) = p_g(i) + W_i \, N(X_{min}, X_{max})$, (5)

where N is the Cauchy distribution function over the interval (Xmin, Xmax).
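A possible reading of (4) and (5) is sketched below. The text does not specify how the Cauchy distribution is restricted to (Xmin, Xmax); the sketch assumes a standard Cauchy sample and clips the mutated coordinate to the interval, which is one option among several. The names `cauchy_sample` and `mutate_best` are hypothetical.

```python
import math
import random


def cauchy_sample(rng=random):
    """Standard Cauchy random number via the inverse CDF."""
    return math.tan(math.pi * (rng.random() - 0.5))


def mutate_best(g_best, velocities, x_min, x_max, rng=random):
    """Mutate the best particle following (4) and (5).

    g_best     : position of the best particle (list of length D)
    velocities : velocities of all NP particles (NP lists of length D)
    Assumption: the mutated coordinate is clipped to [x_min, x_max].
    """
    n_particles = len(velocities)
    mutated = []
    for i, g_i in enumerate(g_best):
        w_i = sum(v[i] for v in velocities) / n_particles  # equation (4)
        candidate = g_i + w_i * cauchy_sample(rng)          # equation (5)
        mutated.append(min(max(candidate, x_min), x_max))
    return mutated
```

Because the weight W_i is the mean velocity component, the mutation shrinks automatically as the swarm settles and the velocities approach zero.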

IV. DYNAMIC POPULATION SIZE IN THE PSO ALGORITHM

In this paper the optimization procedure considers two 

approaches for the dynamic population size, employed in 

the multiobjective optimization problem. The first 

approach is based on the gradual reduction of the 

population size by half (subsection 4A). In the second 

approach, a dynamic reduction of population size is 

proposed, which is described in subsection 4B. 

A. Gradual reduction of the population size by half 

The original idea is presented by Brest [11] and 

proposes gradual reduction of the population size by half 

in each block of a predefined iteration number. This 

means that the reduction is not applied throughout all 

iterations. Fig. 3 shows the example where the population 

reduction has been carried out four times and the 

coefficient that defines the reduction is pmax = 4. In each 

reduction step, the population is reduced by a half in 

comparison to its former size. 

Figure 3: Schematic presentation of the population reduction (population size NP for p=1, NP/2 for p=2, NP/4 for p=3 and NP/8 for p=4).

The stopping criterion for the optimization process is a 

predefined number of function evaluations maxnfeval. In 

relation to the population size reduction, there are two 

possibilities to determine the size of iteration blocks iterp. 

The first possibility is an equal number of function evaluations in each iteration block. The number of iterations in block p is then defined as:

$iter_p = \frac{maxnfeval}{p_{max} \cdot NP_p}$ (6)

The second possibility is a constant number of iterations iterp, so that the number of function evaluations for each reduction block is:

$nfeval = NP_p \cdot iter_p$ (7)

For illustration, Table I shows the number of objective function evaluations (NP times iterp) for the individual population size blocks, with a constant number of iterations iterp = 10 and pmax = 4 population size blocks.

TABLE I
RUN DATA, MAXNFEVAL=1200, PMAX=4

p            1     2     3     4
NP           56    28    14    7
iterp        10    10    10    10
NP x iterp   560   280   140   70
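The rows of Table I follow directly from (7) when the population is halved between blocks. The short sketch below reproduces them; the function name `gradual_schedule` is hypothetical and an initial population size divisible by 2^(pmax-1) is assumed.

```python
def gradual_schedule(np_initial, iter_p, p_max):
    """List (p, NP_p, iter_p, NP_p * iter_p) for each reduction block.

    The population is halved after every block, and the number of
    function evaluations per block follows equation (7).
    """
    rows = []
    np_p = np_initial
    for p in range(1, p_max + 1):
        rows.append((p, np_p, iter_p, np_p * iter_p))
        np_p //= 2  # halve the population for the next block
    return rows


rows = gradual_schedule(56, 10, 4)
total_evaluations = sum(row[3] for row in rows)
```

For NP = 56, iterp = 10 and pmax = 4 the block totals are 560, 280, 140 and 70, which add up to the 1050 function evaluations listed for the gradual reduction in Table II.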

The selection procedure is based on the selection mechanism of the DE optimization algorithm [11]. The individual from the first half of the population, xi(t), and the corresponding individual from the second half, xNP/2+i(t), are compared on the basis of their objective function values. The individual with the better objective function value takes position i and thus becomes a member of the new, reduced population. After the last selection step, the new, reduced population is obtained:

$x_i(t+1) = \begin{cases} x_{NP/2+i}(t) & \text{if } f(x_{NP/2+i}(t)) < f(x_i(t)) \\ x_i(t) & \text{otherwise} \end{cases}$ (8)
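The pairwise selection that halves the population can be sketched as below. The sketch assumes an even population size and a minimization problem (a smaller objective value is better); the function name `halve_population` is hypothetical.

```python
def halve_population(population, objective):
    """Keep the better individual of each pair (i, NP/2 + i).

    population : list of NP individuals, NP assumed even
    objective  : callable returning the value to be minimized
    """
    half = len(population) // 2
    reduced = []
    for i in range(half):
        first = population[i]
        second = population[half + i]
        # The individual from the second half replaces the first one
        # only if its objective function value is strictly better.
        if objective(second) < objective(first):
            reduced.append(second)
        else:
            reduced.append(first)
    return reduced


survivors = halve_population([3.0, 1.0, 2.0, 0.5], lambda x: x * x)
```

In this small example the pairs are (3.0, 2.0) and (1.0, 0.5), so the reduced population consists of 2.0 and 0.5.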

B. Dynamic reduction of population size in individual 

iteration 

The population size is altered during the optimization process on the basis of the objective function evaluations, according to (9):

$NP(t) = NP_{max} \, \frac{f_{avr}(x_i) - f(x_{best})}{f(x_{max})}$, (9)

where favr(xi) is the average objective function value of the observed population, f(xbest) and f(xmax) are the objective function values of the best and worst particles found up to the observed iteration, and NPmax is the initial (maximal) population size.

As the optimization algorithm approaches the optimal solution, the value of the objective function changes. In general, the average objective function value of the population decreases with the iteration number. However, this does not hold for all iterations, because the particles also move through non-promising areas of the search space. Consequently, according to (9), the population size generally decreases, but there are also iterations in which the population is extended. Such an extension usually occurs when the average objective function value increases. The missing particles are taken from the set of particles that were discarded during population reductions in previous iterations. This improves the reliability of the algorithm.

V. RESULTS 

The optimization processes with the PSO algorithm are performed with the following settings: c1 = 0.5, c2 = 1.2, w = 0.8. The maximal number of iterations for all calculations is maxiter = 40.

The optimization results are shown in Table II. The first two rows show results obtained by the standard PSO algorithm with different population sizes, the third row shows results of the standard PSO algorithm upgraded with the Cauchy mutation, and the last two rows show results obtained with a dynamic population size in the standard PSO algorithm. Two different concepts of dynamic population size are considered: the gradual reduction and the dynamic reduction proposed in Section IV.

TABLE II
OPTIMIZATION RESULTS OBTAINED WITH PSO ALGORITHM

description                               min f   NP               maxnfeval
standard PSO                              0.345   30               1200
standard PSO                              0.328   56               2240
standard PSO + mutation                   0.326   56               2240
dyn. populated PSO (gradual reduction)    0.326   56/28/14/7       1050
dyn. populated PSO (proposed reduction)   0.326   max 56, min 10   829

The convergence of the optimization process for all examples in Table II is shown in Fig. 4.

Figure 4: Objective function values of PSO algorithm during the optimization process (objective function value versus iteration for standard PSO with NP=56 and NP=30, standard PSO + mutation, and dyn. populated PSO with gradual and proposed reduction).

The algorithm with the small population size did not reach the global solution because it became stuck in a local optimum. The global solution can be reached by increasing the population size, which leads to an increased number of function evaluations and a longer computation time. The proposed algorithm with dynamic population size achieved the global minimum with a decreased number of function evaluations and a shorter computation time.

Figure 5: Change of population size during the optimization process (size of population versus iteration for the standard procedure with a static population size and for the dynamically populated PSO with gradual and proposed reduction).


The change of population size during the optimization process is shown in Fig. 5 for the gradual reduction, the proposed reduction and the fixed (static) population size.



The results compare the optimization process for different population size reduction methods. Reduction of the population is desirable when the computation time should be decreased, although the efficiency and robustness of the algorithm should not change.

The main benefit of the proposed PSO algorithm with dynamic population size is the decreased number of function evaluations.

In each iteration there is a tendency to decrease the population size. However, the population can also be increased by adding new members, which refreshes the population and thereby increases the ability of the algorithm to search for minima. It is important to select an appropriate, sufficiently large population at the very beginning, so that a global search of the space is enabled. A smaller population size is sufficient only for local search.

REFERENCES

[1] J. Kennedy and R. C. Eberhart, “Particle swarm optimization,” Proc. IEEE International Conference on Neural Networks, Vol. IV, Piscataway, NJ, pp. 1942-1948, 1995.

[2] S.L. Ho, Y. Shiyou, Ni Guangzheng and H.C. Wong, “A particle 

swarm optimization method with enhanced global search ability 

for design optimizations of electromagnetic devices,” IEEE Trans. 

on Magn., vol. 42, no.4, pp. 1107-1110, 2006. 

[3] G. Toscano Pulido and C.A. Coello Coello, “Using clustering techniques to improve the performance of a particle swarm optimizer,” Proceedings of Genetic and Evolutionary Computation Conference, Seattle, WA, pp. 225-237, 2004.

[4] L. dos Santos Coelho, H.V.H. Ayala and P. Alotto, “A Multiobjective Gaussian Particle Swarm Approach Applied to Electromagnetic Optimization,” IEEE Trans. on Magn., vol. 46, no.8, pp. 3289-3292, 2010.

[5] L. dos Santos Coelho, L.Z. Barbosa and L. Lebensztajn, 

“Multiobjective Particle Swarm Approach for the Design of a 

Brushless DC Wheel Motor,” IEEE Trans. on Magn., vol. 46, 

no.8, pp. 2994-2997, 2010. 

[6] U. Baumgartner, C. Magele and W. Renhart, “Pareto optimality 

and particle swarm optimization,” IEEE Trans. on Magn., vol. 40, 

no.2, pp. 1172-1175, 2004. 

[7] L. Jize, S. Ping and L. Kejie, “A Modified Particle Swarm 

Optimization with Adaptive Selection Operator and Mutation 

Operator,” International Conference on Computer Science and 

Software Engineering CSSE 2008, Vol. 1, Wuhan, China, pp. 

1199-1202, 2008. 

[8] W. F. Leong and G. G. Yen "PSO-based multiobjective 

optimization with dynamic population size and adaptive local 

archives", IEEE Trans. Syst., Man, Cybern. B, Cybern., vol. 38, 

no. 5, pp.1270 - 1293 , 2008. 

[9] M. Greeff and AP. Engelbrecht, “Dynamic multi-objective 

optimisation using PSO,” Studies in Computational Intelligence, 

(Series Ed: Kacprzyk, Janusz), pp. 105-123, 2010. 

[10] G. G. Yen and W. F. Leong "Dynamic multiple swarms in 

multiobjective particle swarm optimization", IEEE Trans. Syst., 

Man, Cybern. A, Syst. Humans, vol. 39, p.890 - 911, 2009. 

[11] J. Brest and M. Sepesy Maucec, “Population Size Reduction for 

the Differential Evolution Algorithm,” Applied Intelligence, 29(3) 

pp. 228–247, 2008. 

[12] Program tools ELEFANT. Graz, Austria: Inst. Fundam. Theory 

Elect. Eng., Univ. Technol. Graz, 2000. 

[13] H. Wang, Y. Liu, S. Zeng, H. Li, and C. Li, "Opposition-based Particle Swarm Algorithm with Cauchy Mutation", Proc. of the 2007 IEEE Congress on Evolutionary Computation, 2007, pp. 4750-4756.


Simulation of the Absorbing Clamp Method for 

Optimizing the Shielding of Power Cables 

Szabolcs Gyimóthy∗, József Pávó∗, Péter Kis†, Tomoaki Toratani‡, Ryuichi Katsumi‡ and Gábor Varga∗

∗Budapest University of Technology and Economics, Egry J. u. 18, H-1111 Budapest, Hungary

† Furukawa Electric Institute of Technology, Vasgolyó u. 2-4, H-1158 Budapest, Hungary 

‡ Furukawa Electric R&D Center for Automotive Systems & Devices, Hiratsuka, Japan 

E-mail: gyimothy@evt.bme.hu 

Abstract—An efficient numerical simulation tool based on FEM is proposed, by which the EMC shielding effect 

characteristics of power cables can be predicted in the 30–1000 MHz frequency range, as if it would be measured by the 

Absorbing Clamp Method. The proposed simulation method is based on decomposition: a 2D axisymmetric RF FE model 

is used for describing the whole measurement setup, while a 3D quasistatic FE model is used for the symmetry cell of the 

shielding layer in order to capture the effect of its fine geometric details. The two models are coupled via the concept of 

the equivalent shielding layer obtained by homogenization. Comparison with real measurements show that the shielding 

characteristics can be reliably predicted this way, with some deviation in the low end of the frequency range though. This 

simulation tool can be applied in the design and optimization of braided cable shields to be used in automotive industry. 

Index Terms—EMC testing, cable shielding, homogenization, automotive industry 


The effective reduction of emitted radio frequency (RF) 

disturbances in electric vehicles –generated mainly by 

power semiconductors having high slew rates– becomes 

more and more important nowadays [1]. Partly for 

this reason, the network configuration of cars is being 

changed from unshielded single core multi wire harnesses 

into coaxial conductor layouts. Consequently, optimization 

of the shielding of such cables is a current topic. 

The absorbing clamp method (ACM) is a well known 

technique for measuring electromagnetic interference 

(EMI) generated by electric cables in the range 30– 

1000 MHz [2]. Nowadays it is commonly used in the 

automotive industry for testing electromagnetic compatibility 

(EMC) of braided shields and connectors of 

wire harnesses [3]. The measurement set-up is shown in 

Fig. 1. The central element of the device is the “clamp” 

consisting of a set of split lossy ferrite rings (see Fig. 2) 

and a sensing loop [4]. The shielding effect (SE) to be 

measured is –in essence– the ratio of the output signal 

of the unshielded cable to that of the shielded. 

This method has become a de facto industry standard in many areas. For instance, in addition to RF compliance testing of cables, it is already used for the quantitative evaluation and comparison of the performance of shields through their measured SE characteristics. Therefore, although there are other practical parameters by which one may characterize and optimize a shielding, the design cycle would be much more efficient if the SE curve of a shield prototype (as taken by ACM) could be directly predicted by numerical simulation.

Fig. 1. ACM device for measuring the efficiency of cable shielding.

ACM is simple and relatively cheap, but it was not 

developed –in the late 60’s– with numerical modeling 

in mind. Although there are some theoretical studies on 

its operation [5], they are far from being applicable for 

quantitative prediction. Actually, the simulation of this 

measurement is quite challenging from the numerical 

point of view: a) the set-up consists of several components, 

among others ferrite; b) a wide frequency range is 

studied; c) the arrangement is “large” but has some very 

fine details. 

Several types of numerical models have been worked 

out to simulate this measurement, based on e.g. coupled 

transmission line system (TLS), finite element method 

(FEM) and method of moments (MoM), with little success 

though [6][7]. Surprisingly, none of them was able 

to catch even the main tendency of the characteristics. 

Anyone might conclude from the physical picture that the shielding effect gets better at higher frequencies, and actually the above mentioned computing models just confirmed this behavior. However, the real ACM measurement of a typical braided cable shield usually results in an SE characteristic that decays as frequency increases (cf. Fig. 13). Whether this behavior is intrinsic to the shielding or just a “side effect” of the measurement method was an open question.

Fig. 2. Absorbing clamp of type R&S® MDS-21 [4].

II. SUMMARY OF THE MODELING APPROACH 

It was realized that detailed 3D modeling of the measurement is not feasible because of the enormous computational resources needed for a correct analysis of the complicated arrangement.

Our idea is to use decomposition and homogenization. Different models are used for the braided shield details and the overall set-up, respectively. ACM measurements at higher frequencies tend to show little dependence on the larger environment (e.g. support, ground, walls). This suggests the use of a simplified axisymmetric 2D finite element (FE) model of the arrangement, which can be analyzed efficiently.

On the other hand, realistic cable shields do not 

exhibit such symmetry. To overcome this difficulty, we 

introduced the concept of equivalent homogeneous (bulk) 

shielding layer that may have frequency dependent complex 

conductivity, such that its shielding characteristics 

approximates that of the original braided shield. Also a 

method was developed by which the equivalent conductivity 

parameter can be identified. Although this latter 

requires a 3D FE model of the shield, the domain of 

computation extends to only a single symmetry cell of the 

geometry. As a result, this 3D analysis is manageable in a 

relatively moderate computing environment, too. Putting 

the equivalent shield with its properly selected parameters 

into the 2D model of the experimental set-up provides the 

simulated output of the measurement. 

III. THE 2D AXISYMMETRIC RF FE MODEL 

By omitting the support and the surroundings of the 

measurement device and taking the two reflector plates as 

disc-shaped, we get an axially symmetric arrangement, of 

which a 2D longitudinal section is enough to be considered. 

We used the RF Module of Comsol Multiphysics®

in the application mode “Electromagnetic Waves / TM 

Waves / Harmonic propagation” and with model space 

dimension “Axial symmetry (2D)” [8]. 

The geometry of the 2D model can be seen in Fig. 3. 

Cable conductor and reflector plates are both modeled 

as perfect electric conductor (PEC) boundary conditions. 

The terminal impedance is modeled by means of a 

50 Ω coaxial cable section with non-reflecting boundary 

condition at its end. The arrangement is considered 

open in the radial direction, which is modeled by a 

cylindrical perfectly matched layer (PML). Finally, the

excitation is realized as a 50 Ω coaxial cable, fed by an 

incident coaxial-mode TEM wave, which is prescribed as 

“port” boundary condition and characterized by the input 

voltage Uin. 


Fig. 3. The 2D axisymmetric FE model built in Comsol RF. 

The output signal is the induced voltage Uout of an 

assumed loop encircling the first ferrite ring (the one 

which is closest to the feed). The transfer function is 

defined as the ratio of output voltage to input, and 

normally its gain k versus the frequency f is used, 

$k(f) = 20 \log_{10} |U_{out}/U_{in}|$ [dB]. (1)

The shielding effect (SE) of a given shield is defined here 

as the ratio of the output voltage measured for the bare 

cable core (i.e. for the cable with its shielding removed) 

to that one measured for the shielded cable. Provided the 

input voltage is kept fixed, the SE characteristics (given 

in dB, as function of the frequency) can be expressed as 

follows, 

$SE(f) = k_{un}(f) - k_{sh}(f)$, (2)

where “un” and “sh” stand for the unshielded and 

shielded configurations, respectively. 
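Definitions (1) and (2) translate into a few lines of code. The sketch below is only an illustration with hypothetical function names and made-up voltage values; the voltages may be complex phasors, since only their magnitude ratio enters the gain.

```python
import math


def gain_db(u_out, u_in):
    """Gain k of the transfer function in dB, equation (1)."""
    return 20.0 * math.log10(abs(u_out / u_in))


def shielding_effect_db(u_out_unshielded, u_out_shielded, u_in):
    """Shielding effect in dB for a fixed input voltage, equation (2)."""
    return gain_db(u_out_unshielded, u_in) - gain_db(u_out_shielded, u_in)


# Made-up phasor amplitudes, used only to show the arithmetic.
se_example = shielding_effect_db(0.1, 0.001, 1.0)
```

In the made-up example the unshielded gain is -20 dB and the shielded gain is -60 dB, so the shielding effect is 40 dB. Because the input voltage is the same in both configurations, it cancels in the difference.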

Figure 4 is a snapshot of the circumferential component

of the magnetic field, Hϕ, taken in the unshielded 

case. It can be observed how two waves –one along the 

line with a longer wavelength and one along the series 

of ferrite rings with shorter– are coupled with each other. 

The damping effect of the ferrite is also observable on 

the plot. Notably, this FE model performs well even in 

the reconstruction of the stray field of bulk aluminum 

shielding layers, for which SE can be as high as 130 dB. 

A. Parameter Dependency of the 2D Model 

We have carefully investigated the sensitivity of the 

computed results on several model parameters –including 

some of the numerical implementation– in order to filter 

out all possible sources of inconsistency between the 

model and the real measurement. First comes a list of 

those parameters which have little or no effect: value 

of the terminal impedance; the implementation of open

boundary (e.g. PML, absorbing boundary condition, etc.); 

fluctuation of the input voltage.

Fig. 4. Snapshot of the circumferential magnetic field.

The latter needs some explanation. Since the device is not matched to the signal generator source, the voltage along the feeding cable becomes location dependent due to reflections; moreover, this dependence is a function of the frequency. In the FE model we consider a section of the feeding cable of fixed length, which implicitly defines the quantity Uin for the model. However, this may not be the same as its real (measured) counterpart. Fortunately, the computed SE curves do not show significant dependence on the “definition” of Uin, as the simulations confirmed.

Two factors that really affect the results are the permeability 

characteristics of the ferrite rings, as well as the 

assumption of axial symmetry. These are detailed in the 

next two subsections. 

B. Permeability of the Ferrite Rings 

The permeability of the ferrite material plays a dominant 

role in forming the SE characteristics predicted by 

the model, hence its accurate description is critical. Since 

permeability data for the given absorbing clamp were 

not available, we made measurements. The obtained frequency 

dependence of the complex relative permeability 

(real and imaginary parts) can be seen on the graph of 

Fig. 5. Note that the data above 100 MHz are extrapolated 

values. 

C. Limitations of Assuming Axial Symmetry 

Considering the field plot in Fig. 4, one can conceive that the ferrite clamp acts as a waveguide and, at higher frequencies, can give rise to higher order modes. Although with the use of coaxial feeding the axisymmetric mode(s) are deemed predominant, non-axisymmetric modes can appear as well wherever symmetry is violated in the cross section of the device.

Fig. 5. Complex relative permeability of the ferrite material (real part and negative imaginary part versus frequency).

For studying this behavior we carried out the mode 

analysis for the cross section of the device using FEM. 

In order to break axial symmetry, a large grounded 

conducting plate –being parallel with the axis of the cable 

under test– was added to the arrangement. In addition 

to this, the full 3D FE model of the arrangement was 

analyzed. Note that an unshielded cable was examined 

in both cases for the sake of simplicity. 

Figure 6 compares the transfer functions obtained from the full 3D and the 2D axisymmetric model, respectively. Although several possible propagating modes exist towards 1 GHz, there is only a little deviation between the curves, mostly limited to the lower frequencies. From this we concluded that the effect of higher order non-axisymmetric modes is negligible, and that the axisymmetric model is acceptable within this frequency range.

Fig. 6. Comparison of the transfer functions computed by the 2D and 

3D FE models, respectively, for the case of an unshielded cable. 


IV. HOMOGENIZATION AND EQUIVALENT SHIELDING 

The goal is to replace the shield of complex geometry with a homogeneous cylindrical shielding layer, hereinafter called the “equivalent shield”. The task is to determine the parameters of the latter so that its shielding effect is the same as that measured on the real cable shielding. Of course, this equivalence needs to be satisfied only approximately, and only on (and outside of) an observation surface, i.e. at and beyond a certain distance from the shield (Fig. 7).

Fig. 7. Illustration of the concept of homogenization (left: 3D model with the real texture of wire, shield and isolation; right: 1D model with the equivalent homogeneous material; the observation surface is marked).

Figure 8 demonstrates how the roughness of the stray field distribution smooths out as the distance from the shield increases (the radial component of the electric field, Er, is plotted along a line parallel with the cable, at 1 GHz). This smoothing behavior justifies the use of the homogeneous shield as a replacement in the 2D model. Note that the shield tested here is not braided but leaky (cf. Fig. 10), hence for braided shields one can expect a stronger homogenization effect.

There are several parameters that can be varied in order 

to find an equivalence, like for instance the inner and 

outer radii of the layer, or the specific electric conductivity 

and magnetic permeability of the material. In order to 

simplify and regularize the inverse problem we decided 

to keep the geometry fixed, and choose non-magnetic 

material. Hence only the complex valued conductivity, 

σ, of the layer remained to be determined. The proposed 

values for the inner and outer radius of the equivalent 

shield are r1 = 10 mm and r2 = 11 mm, respectively. The observation surface is at the radius robs = 23.5 mm, which coincides with the inner radius of the ferrite rings.

A. Evaluation of the Scalar Conductivity Parameter 

For those shielding configurations which show certain 

symmetry in the ϕ direction, like the one on the left 

hand side in Fig. 9, the azimuthal components of the 

shielding currents are symmetric too. As a consequence, 

the axial component of the magnetic fields caused by 

those currents are compensated, and thereby vanish. 

This allows us to investigate only the “axial electric – 

azimuthal magnetic” field mode, and hence to describe 

the conductivity σ by a complex scalar value. 

Fig. 8. Smoothness of the field at various distances from the shield.

If the current in the wire is given and the parameter σ is known, the electric and magnetic fields can be determined from Maxwell’s equations. Since the problem with homogeneous shielding is essentially one dimensional (see Fig. 7, right), the solution can be given analytically. We can describe the electric and magnetic fields, each, by one single component,

$\mathbf{E}(r, \varphi, z, t) = \mathrm{Re}\{E_z(r)\, e^{-j\omega t}\}\, \mathbf{e}_z$ (3)

$\mathbf{H}(r, \varphi, z, t) = \mathrm{Re}\{H_\varphi(r)\, e^{-j\omega t}\}\, \mathbf{e}_\varphi$ (4)

where time harmonic fields of angular frequency ω are assumed, and j denotes the imaginary unit. Using the quasi static approximation in the conducting region, one can easily derive the following Bessel’s differential equation from Maxwell’s equations:

$\frac{d^2 E_z}{dr^2} + \frac{1}{r} \frac{dE_z}{dr} - j\omega\mu_0\sigma E_z = 0, \quad r_1 < r < r_2$

• Hϕ is specified at r0 (this is equivalent to prescribing 

the total current of the wire), 

• both Ez and Hϕ are continuous at r1 and r2, 

• the asymptotic behavior of the fields for r → ∞ is

known. 

We omit further details of the solution as they can 

be found in several textbooks on electromagnetism [9]. 

The closed formula expressing the magnetic field was 

implemented as a Matlab® function, where the input is

the complex conductivity σ, and the output is the complex 

amplitude (phasor) of Hϕ at r = robs. 

We use the following procedure for the evaluation of 

the equivalent conductivity of the homogeneous shielding 

layer (let us denote it with σeq in the following): 

1) We create the 3D FE model of the real shielding 

together with an exciting wire centered in its axis. 

We compute the magnetic field at some selected 

frequencies between 30 and 1000 MHz.

2) We take samples of the azimuthal magnetic field on 

the observation surface, and calculate its average. 

This is what we call the “observed magnetic field” 

and denote with Hϕ,obs. 

3) In an optimization loop we attempt to find σeq 

for which Hϕ (a function of σ) and Hϕ,obs match 

the best (this is carried out for each frequency 

separately): 

$$\sigma_{\mathrm{eq}} = \arg\min_{\sigma} \left| H_\varphi(\sigma) - H_{\varphi,\mathrm{obs}} \right| \tag{8}$$

Some remarks on the procedure. Notably, in step 1) it 

is enough to take one symmetry cell of the arrangement 

as Fig. 10 demonstrates. We used the AC/DC module 

of Comsol Multiphysics® for computing the fields with

quasi static approximation [10]. In step 2) we also verify 

whether the z component magnetic field, Hz itself, as 

well as the fluctuation of Hϕ on the observation surface 

are really negligible. Finally, in step 3) we used the 

Matlab® function fminsearch for the minimization.
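Step 3 of the procedure can be sketched in a few lines. The sketch below is only an illustration under stated assumptions: `h_phi_model` is a toy planar-slab stand-in for the closed-form cylindrical solution (which is not reproduced in the text), a simple compass (pattern) search replaces the Nelder-Mead simplex of fminsearch, and `sigma_true`, the frequency and all function names are hypothetical.

```python
import cmath
import math

MU0 = 4e-7 * math.pi  # vacuum permeability (H/m)


def h_phi_model(sigma: complex, freq_hz: float, thickness_m: float = 1e-3) -> complex:
    """Toy stand-in for the closed-form H_phi(sigma) at the observation radius.

    ASSUMPTION: planar-slab attenuation 1/cosh(gamma*d) with
    gamma^2 = j*omega*mu0*sigma. The paper uses a cylindrical
    (Bessel-function) solution instead.
    """
    gamma = cmath.sqrt(1j * 2.0 * math.pi * freq_hz * MU0 * sigma)
    return 1.0 / cmath.cosh(gamma * thickness_m)


def fit_sigma(h_obs: complex, freq_hz: float, sigma0: complex,
              step: float = 1e3, min_step: float = 1e-3,
              max_iter: int = 10_000) -> complex:
    """Minimise |H_phi(sigma) - H_obs| over (Re sigma, Im sigma) by compass search."""

    def residual(s: complex) -> float:
        return abs(h_phi_model(s, freq_hz) - h_obs)

    best, best_val = sigma0, residual(sigma0)
    for _ in range(max_iter):
        if step < min_step:
            break
        # Probe +/- step along the real and imaginary axes of sigma.
        for delta in (step, -step, 1j * step, -1j * step):
            trial = best + delta
            val = residual(trial)
            if val < best_val:
                best, best_val = trial, val
                break
        else:
            step *= 0.5  # no direction improved: refine the step
    return best


freq = 100e6                  # one frequency; the fit is repeated per frequency
sigma_true = 2.0e4 - 5.0e3j   # hypothetical value used to synthesise H_obs
h_obs = h_phi_model(sigma_true, freq)
sigma_eq = fit_sigma(h_obs, freq, sigma0=1.5e4 + 0j)
print(sigma_eq, abs(h_phi_model(sigma_eq, freq) - h_obs))
```

In practice `h_obs` would be the averaged azimuthal field sampled from the 3D FE solution, and the loop would be run once per frequency with the previous result as the starting point.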

B. Evaluation of the Conductivity Tensor 

For shielding configurations like the spiral-structure in 

Fig. 9 on the right, the above described method is not 

suitable because the magnetic field is no longer purely azimuthal.

In this case we can still use the anisotropic conductivity 

tensor in the equivalent homogeneous shielding 

layer to imitate a similar phenomenon. The conductivity 

tensor is assumed to have the following form: 

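$$\boldsymbol{\sigma} = \begin{bmatrix} \sigma_{rr} & 0 & 0 \\ 0 & \sigma_{\varphi\varphi} & \sigma_{\varphi z} \\ 0 & \sigma_{z\varphi} & \sigma_{zz} \end{bmatrix} \tag{9}$$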

That is, the r − ϕ and r − z cross-effects are supposed to 

have negligible contribution to the observable magnetic 

field. Moreover, σϕz = σzϕ is expected. 

To demonstrate the treatment of anisotropy we derive the equations for the field components in the conductive region. First, the governing Maxwell’s equations are written in the quasi static approximation, in the time harmonic regime:

$$\nabla \times \mathbf{H} = \boldsymbol{\sigma}\mathbf{E}, \qquad \nabla \times \mathbf{E} = -j\omega\mu_0 \mathbf{H} \tag{10}$$

Fig. 10. Symmetry cell of the shielded cable used for determining the σ parameter of the equivalent homogeneous shielding layer. This structure was inspired by leaky coaxial (LCX) cables.

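Taking into account the obvious symmetries of the configuration with homogeneous shielding layer, i.e. ∂/∂z = 0 and ∂/∂ϕ = 0, the resolution of (10) written for the three cylindrical components is the following:

$$r:\quad 0 = \sigma_{rr} E_r, \qquad 0 = -j\omega\mu_0 H_r \tag{11}$$

$$\varphi:\quad \begin{cases} -\dfrac{\partial H_z}{\partial r} = \sigma_{\varphi\varphi} E_\varphi + \sigma_{\varphi z} E_z \\[6pt] -\dfrac{\partial E_z}{\partial r} = -j\omega\mu_0 H_\varphi \end{cases} \tag{12}$$

$$z:\quad \begin{cases} \dfrac{1}{r}\dfrac{\partial}{\partial r}\left(r H_\varphi\right) = \sigma_{z\varphi} E_\varphi + \sigma_{zz} E_z \\[6pt] \dfrac{1}{r}\dfrac{\partial}{\partial r}\left(r E_\varphi\right) = -j\omega\mu_0 H_z \end{cases} \tag{13}$$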

It can be seen that the radial components vanish, and 

also that σrr does not play a role. However, there are no 

longer separable Hϕ − Ez and Hz − Eϕ modes, as in 

the isotropic case. Solving equations (11)–(13)

is not trivial, but since the domain is one dimensional, a fast

solution method can be established. The algorithm given

in section IV-A needs to be modified slightly here:

1) The same as in the isotropic case, but we have 

to carry out the FE analysis for two different 

excitations: one in which Hϕ is prescribed at the 

wire surface (r = r0), and one in which Hz is 

prescribed there (whatever would be the physical 

meaning of the latter). These two solutions are 

marked with ( ′ ) and ( ′′ ) in the following. 

2) We take samples of the axial and azimuthal 

magnetic field on the observation surface from 

both FEM solutions, and calculate their average. 

This way we obtain the observed magnetic fields,

H′ϕ,obs, H′′ϕ,obs, H′z,obs and H′′z,obs, respectively.

3) We attempt to optimize the components of σ as 

above. The generalization of the scalar case is 

straightforward: 

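$$\sigma_{\mathrm{eq}} = \arg\min_{\boldsymbol{\sigma}} \Big( \left| H'_\varphi - H'_{\varphi,\mathrm{obs}} \right| + \left| H''_\varphi - H''_{\varphi,\mathrm{obs}} \right| + \left| H'_z - H'_{z,\mathrm{obs}} \right| + \left| H''_z - H''_{z,\mathrm{obs}} \right| \Big) \tag{14}$$

V. TEST RESULTS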

For testing the method we chose a simple shield structure 

(Fig. 11) inspired by leaky coaxial (LCX) cables. The 

shield is made of aluminum; the inner radius of the

tube is 10 mm; the wall thickness is 1 mm. The shield

has circular holes of 5 mm radius in a regular distribution;

there are 4 holes along the circumference. 

Fig. 11. Leaky aluminum shield used for testing the method. 

Since the geometry is symmetric with respect to the 

azimuthal (ϕ) direction, we are allowed to use the equivalent 

scalar conductivity (c.f. section IV-A). By solving 

(8) we obtained the σeq curves presented in Fig. 12. 

Note that these curves are not the only suitable ones, 

because the solution of (8) is not unique. However, we 

experienced that quite different σeq curves resulted in 

the same SE characteristics at the end, so they can be 

considered equally good in this respect. 

Figure 13 shows the SE curve computed by building the σeq characteristics of Fig. 12 into the 2D axisymmetric FE model. For comparison, the figure also shows the curve obtained by real measurement. The SE characteristic is predicted reliably by the model, with some deviation at the low end of the frequency range.

Fig. 12. The computed equivalent complex conductivity, σeq.

Fig. 13. Comparison of measured and simulated shielding effects. [Plot: Shielding Effect (dB), axis ticks from 15 to 60, against a logarithmic axis with ticks at 10^7 and 10^8; curves: measured, simulated.]


A numerical simulation method has been elaborated that 

can be used to predict the shielding effect of shields with 

various patterns. Using this tool the designer can predict 

the usability of a given shield construction. This simulation 

method has been thoroughly verified by theoretical 

considerations, numerical experiments, and also by some 

experimental data. In the authors’ opinion, the error of 

prediction at low frequency might be due to either the 

inexact knowledge of the ferrite permeability characteristics 

or the insufficiency of the 2D axisymmetric model. 

REFERENCES 

[1] M. Reuter, S. Tenbohlen, W. Köhler, and A. Ludwig, “Impedance 

analysis of automotive high voltage networks for EMC measurements,” 

in 10th Int. Symposium on Electromagnetic Compatibility 

(EMC Europe), York (UK), 26-30 Sept 2011. 

[2] A. Tsaliovich, Electromagnetic Shielding Handbook for Wired and 

Wireless EMC Applications, ser. Kluwer international series in 

engineering and computer science. Kluwer Academic, 1999. [Online]. 

Available: http://books.google.hu/books?id=4vl0S6fZo-IC 

[3] S. Miyazaki, S. Kihira, and T. Nozaki, “New shielding construction 

of high-voltage wiring harnesses for Toyota Prius – winning 

of Toyota Superior Award for cost reduction,” Sumitomo Electric 

Industries Technical Review, no. 61, pp. 21–23, Jan 2006. 

[4] Rohde & Schwarz® MDS-21 Absorbing Clamp – Data

sheet. [Online]. Available: http://www2.rohde-schwarz.com/file/ 

MDS-21 EZ-24 dat en.pdf 

[5] D. Williams and S. Jones, “Time domain characterization and 

modelling of the absorbing clamp. a device for measuring radiated 

radio frequency power,” in Eighth International Conference on 

Electromagnetic Compatibility, 21-24 Sept 1992, pp. 149–159. 

[6] L. Fejérvári, “Simulation of wire harness radiation,” Furukawa 

Electric Institute of Technology, Budapest, Tech. Rep., March 

2009. 

[7] P. Kis, “Simulation of wire harness radiation,” Furukawa Electric 

Institute of Technology, Budapest, Tech. Rep., March 2010. 

[8] Comsol Multiphysics RF Module User’s Guide, COMSOL AB, 


[9] K. Simonyi, Foundations of Electrical Engineering: Fields, Networks, 

Waves. London: Pergamon, 1963. 

[10] Comsol Multiphysics AC/DC Module User’s Guide, COMSOL 

AB, November 2008. 



A Neural Network Approach to Sizing an Electrical 

Machine 

Steven Bielby, David A. Lowther 

Electrical and Computer Engineering Department, McGill University, 3480 University Street, Montreal, Quebec, 

Canada. H3A 2A7 

E-mail: david.lowther@mcgill.ca 

Abstract—The first stage in the design of an electrical machine, or an electromagnetic device, is usually referred to as 

“sizing”. It produces an approximate description (or design) of the desired device in terms of its major physical dimensions. 

This is traditionally performed using simple magnetic circuits or electric equivalent circuits. This paper proposes an 

approach based on a database of device performance data and a neural network to estimate the values of the parameters 

necessary to provide a complete initial description of the device. 

Index Terms—Design, electrical machines, sizing, neural networks. 

The design of any artifact is a process of searching an appropriate space at increasing levels of complexity. This space is usually referred to as the design space. It is, fundamentally, a high dimensional space relating the performance of a device to the values of a set of parameters that describe the structure and physical properties of the device. The performance, of course, might not be a single variable but several. For example, the cost, size and weight as well as output or functional capabilities might all be performance variables. The physical parameters could be the actual dimensions describing the physical structure; the material properties; or the external environment including the excitation sources, the mechanical loads, etc. Thus the design space can be extremely complex and the design process for a device needs to be able to balance output objectives and input specifications. The difficulty in design is that the challenge posed by the specification is, in general, an inverse problem, i.e. the designer is asked to produce a physical structure which will meet objectives which are the input to the design. For example, for an electrical machine, the design input might well be the torque required from the device over a particular speed range. In addition, there might be electrical constraints imposed by the form of the power supply.

While it might be possible to achieve the design of an electrical machine from first principles, i.e. starting from the equations of physics and knowledge of the properties of various materials, this involves working at the maximum detail of the device. To use this approach, the problem requires the solution of the field equations and these require that all the physical parameters are identified and given values. This generates a huge design space to work in. The approach has been demonstrated on simple models but is computationally extremely expensive [1]. Searching such a space without an initial idea of where a solution might be is extremely slow on existing computational platforms. In addition, such an approach has difficulty including economic, manufacturing, and other constraints on the design.

Not only is the speed of the process an issue in that the design can take too much time, many of the parameters that are needed for a complete field solution of the device have little or no effect on the main performance parameters. For example, the magnitude of the torque in an electrical machine is dependent on the variation of the stored energy in the airgap of the device as the rotor changes position. The energy, in turn, depends on the fields in the airgap and, to a first approximation, these may be related to the current densities and the size of the airgap. This information can be expressed through a basic magnetic circuit.

Consequently, the process usually employed for relatively conventional structures relies on the knowledge of the designer and simple equivalent circuits for the initial steps. The design starts at an extremely coarse high level and works downwards. The equivalent circuit model of an electrical machine provides a way of relating electrical, mechanical and, possibly, thermal performance of a device to the values of a relatively small number of components and thus is easy to work with as well as providing fast initial iterations. However, it is not always easy to relate the equivalent circuit values to the physical structures themselves and the knowledge base of the designer helps to bridge this gap at the early stages of the design process. Fig. 1 illustrates the hierarchical process involved where the initial search for a design solution takes place at a high level with a limited set of parameters but a large space to explore. As the design space is narrowed down, the number of parameters is expanded to provide a more detailed examination of the local space. At each level of the process, the analysis performed becomes more complex and, consequently, more expensive. If design is considered to be an optimization process, then the phases of exploration and exploitation re-occur at each level. At some point, the “virtual” (computer based) design is terminated because an increase in the level of detail will have no effect on the required accuracy of the performance parameters.

Figure 1. Hierarchical Design Process 

The starting point at the highest level can, clearly, 

affect the convergence and time taken to complete the 

process. In an industrial organization, determining the 

starting point is performed through one or both of two 

processes. The first involves a database of previous 

designs [2], while the second uses simple magnetic 

circuit models (mentioned earlier) which vary in their 

effectiveness. However, if little or no design experience 

exists and a magnetic circuit is ineffective, for example in 

the case of systems with non-linearity and eddy currents, 

these two approaches may fail. The approach suggested 

in this paper is to use a neural network to map the desired 

machine performance onto a set of parameter values for 

the device. 

II. THE CONCEPT OF SIZING 

Following from the above discussion, the first stage in 

the design process of an electrical machine is to estimate 

the approximate size of the device needed to meet a set of 

specifications. Traditionally, this is based on a few, 

relatively basic, rules. For example, the torque that can be 

delivered by an electrical machine is given in Equation 1: 

$$T = 0.5\,\pi D^2 L\,(B \cdot J) \tag{1}$$

Where D is the outer diameter of the rotor, J is the 

effective stator current sheet in amps per meter of 

circumference, L is the length of the rotor, and B is the 

flux density crossing the air gap. 
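As a quick numerical illustration of this sizing rule, the sketch below evaluates the torque and inverts the rule for the rotor diameter. It assumes Equation 1 has the form T = 0.5·π·D²·L·B·J, i.e. the air-gap shear force B·J·πDL acting at a lever arm of D/2; the aspect ratio L/D, the function names and all numbers are hypothetical and not taken from the paper.

```python
import math


def sizing_torque(d_m: float, l_m: float, b_t: float, j_a_per_m: float) -> float:
    """Torque estimate T = 0.5*pi*D^2*L*B*J in N*m (assumed form of the sizing rule)."""
    return 0.5 * math.pi * d_m ** 2 * l_m * b_t * j_a_per_m


def rotor_diameter(torque_nm: float, aspect_ratio: float,
                   b_t: float, j_a_per_m: float) -> float:
    """Invert the rule for D, given a chosen aspect ratio L/D (hypothetical input)."""
    return (torque_nm / (0.5 * math.pi * aspect_ratio * b_t * j_a_per_m)) ** (1.0 / 3.0)


# Illustrative numbers only.
t = sizing_torque(d_m=0.1, l_m=0.1, b_t=0.5, j_a_per_m=20_000.0)
d = rotor_diameter(t, aspect_ratio=1.0, b_t=0.5, j_a_per_m=20_000.0)
print(f"T = {t:.2f} N*m, recovered D = {d:.3f} m")
```

The inversion shows why such a rule is useful for sizing: once B and J are fixed by material and cooling limits, a torque requirement translates directly into a rotor volume.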

How these quantities are created is the job of the rotor 

and stator structures. In a simple machine, the flux 

density in the airgap can be estimated from a basic 

magnetic circuit where the only component that provides 

a magnetic “resistance” to the magnetic flux is the airgap. 

The benefit of this approach is that it can provide 

approximate values for many of the key parameters in the 

machine design and it, in effect, locates the most likely 

area in the design search space for a solution. The levels 

of accuracy needed here are relatively low – the goal is to 

find a “ball park” estimate for the parameters so a 

solution within 10% or 20% of the real answer is good 

enough to begin the design process. The issue is speed – 

these results can be obtained very quickly and the 

exploration of a possible design space can be completed 

at a much reduced cost. 

However, given a need for only an approximate 


solution but to deliver it extremely quickly, there are 

several alternate paradigms which could achieve this on a 

relatively unsophisticated computing device. In modern 

terms, there are two approaches which can be used. The 

first is to generate a surrogate model [3]. This is, in

effect, what the equivalent circuit approach does,

i.e. the real model is replaced by a simplified

structure which performs in much the same way. The

effectiveness and accuracy of the surrogate can be 

controlled relatively easily. The relationship between the 

output performance and the input parameters is often 

referred to as the “Response Surface” [4] and the 

accuracy or fidelity of the response surface depends on 

the surrogate model chosen. An alternate approach is to 

generate a series of points on the response surface and 

then to develop a curve fitting or interpolation system to 

estimate other points on the surface. In this case, the 

response surface can be modeled in a way that estimating 

a new point and determining the values of the input 

parameters for this point can be achieved very quickly. 

This is a variant of the approach suggested in [5]. 

However, the development of the surrogate can be 

computationally expensive since full field solutions may 

be necessary. The gain, of course, over the simple 

magnetic circuit approach is that the synthesized initial 

design is likely to be much more detailed and to take into 

account more of the real behavior of the device than the 

simple assumption of perfect magnetic materials and only 

an air gap. 

The approach being proposed in this paper is a 

combination of the two “conventional” systems. The 

surrogate model is based on a neural network and an 

existing database of solutions is used to train the 

network, thus providing an approximation to the response 

surface. Such a system can deliver an initial estimate of a 

design with minimal computational effort and has the 

added advantage of improving its capabilities after each 

design as the new design can be added to the training 

database. 

III. NEURAL NETWORKS 

A neural network is an interconnected system of basic 

processing elements [6]. Each element performs a simple 

computation based on its inputs and produces an output. 

Each neuron “sees” a different weighted set of inputs and 

the outputs are combined to generate the output of the 

network. Fig. 2 illustrates the process.

Figure 2. Basic Neural Network Architecture 

The architecture shown in Fig. 2 is often referred to as a

“feed-forward network” in that the input data is fed in 

one direction through the network and the network 

operates synchronously. 

The operation of the network is controlled through the 

values of the weights on the neuron inputs and the 

combination of the neuron outputs. If a vector of input 

values, representing a point in a multi-dimensional space, 

is presented at the inputs, the network will respond with 

an appropriate output. In a sense, the network operates

like an active read-only memory, the output from the

memory system being determined not by addressing a

particular location in the memory but based on the

content of a memory location. Thus, neural networks are

often referred to as “auto-associative” or “content-addressable”

memories. The feed-forward network is

only one of several possible neural interconnections and 

is interesting in this application because it is able to 

interpolate between examples it has been shown. 

The process of defining the operation of a particular 

network is known as “training”. In this phase, the 

network is provided with a set of input vectors and the 

output corresponding to each vector. The weights seen by 

each neuron and their final combination into the output 

are adjusted to minimize the error between the required 

output and the one that is actually generated. This can 

often be a difficult process depending on the form of the 

neurons themselves. 

The neurons can be based around several different 

functions. Traditionally, neurons have used a paradigm 

based on a summing amplifier. Each neuron provides the 

weighted sum of its inputs which is then processed by a 

thresholding function. The outputs of all the neurons are 

then summed at the output. Each neuron operates over 

the entire input space. In this case, a neuron is said to 

“fire”, i.e. produce an output, when the weighted sum of 

the inputs exceeds some threshold value. Thus the 

evaluation of the weights requires the satisfaction of a set 

of inequalities and the solution is non-unique. In 

addition, in order to model sophisticated functions, 

several layers of neurons may be needed and this can lead 

to difficulties in the training operation. 


For the work described in this paper, neurons based on radial basis functions are used. In this case, the output function of the network is described by:

$$f(\mathbf{x}) = \sum_i W_i\,\varphi\left(\lVert \mathbf{x} - \mathbf{c}_i \rVert\right) \tag{2}$$

Where ci represents the center of the area of interest of a single neuron and x is the position of the current input point in the parameter space being considered. Wi represents the trained weight of neuron i.

The function, φ, is given by:

$$\varphi(y) = e^{-y^2/\sigma^2} \tag{3}$$

Where σ controls the domain of influence of the neuron.

The training process can thus determine the values of

W and σ for each neuron. Each neuron thus has a local

effect. The determination of the weights for a network 

based on these functions can be expressed as an 

optimization problem and the approach results in a 

network that is easier to train. 
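A minimal sketch of such a network is given below, assuming Gaussian basis functions of the form exp(-y²/width²), one neuron centred on each training point, and a single fixed width. These are simplifications: the training described in the text also adjusts the width of each neuron, and the toolbox actually used may select centres differently. The data values and function names are made up for illustration.

```python
import math


def gaussian(y: float, width: float) -> float:
    """Gaussian radial basis function exp(-y^2 / width^2)."""
    return math.exp(-(y * y) / (width * width))


def distance(a, b) -> float:
    """Euclidean distance between two points given as tuples."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def solve(matrix, rhs):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(rhs)
    aug = [row[:] + [r] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - s) / aug[r][r]
    return x


def train(centres, targets, width):
    """One neuron per training point: solve Phi W = targets for the weights."""
    phi = [[gaussian(distance(xi, cj), width) for cj in centres] for xi in centres]
    return solve(phi, targets)


def predict(x, centres, weights, width):
    """Network output: weighted sum of the neuron responses."""
    return sum(w * gaussian(distance(x, c), width) for w, c in zip(weights, centres))


# Toy data: (radius_mm, thickness_mm) -> torque; the values are made up.
centres = [(25.0, 2.0), (25.0, 3.0), (26.0, 2.0), (26.0, 3.0), (25.5, 2.5)]
targets = [10.0, 14.0, 11.0, 16.0, 12.5]
weights = train(centres, targets, width=1.0)
print(predict((25.0, 2.0), centres, weights, width=1.0))  # a training point
print(predict((25.2, 2.4), centres, weights, width=1.0))  # an unseen point
```

With the centres and width fixed, finding the weights reduces to a linear solve, which is one reason networks of this type are comparatively easy to train.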

Once trained, the network can reproduce the examples 

that it was shown. However, there is also an emergent 

property in that it is able to “generalize”, i.e. it can 

generate outputs for input vectors it has not “seen” 

before. The process of building a neural network can be 

considered similar to fitting a surface in a multidimensional 

space to a set of data points. In fact, there is 

some commonality here with methods used in meshless 

systems to evaluate field solutions [7]. The network can 

function extremely quickly since the individual neuron 

operations are computationally simple and it acts as a 

look-up table for the unknown surface. 

How well the network can match the input data and 

corresponding outputs and how good the generalization 

capabilities are depends on the network design. The 

number of neurons can be considered to be similar to the 

number of basis functions used to represent the surface. 

If too few are used, the network will have a problem 

training to the presented data with sufficient accuracy; if 

too many are used, the network will have difficulty

generalizing and may generate large errors between the

known data points (a sort of high frequency oscillation 

between the points). For this reason, the training set is 

generally split into two pieces: the first is used to train 

the network; the second, which has not been seen by the 

network during training, is used to test the generalization 

capabilities. This can then lead to a higher level process 

where the network architecture, i.e. the number of 

neurons used, is modified during the training process to 

try to improve the generalization performance. 
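The outer loop over network sizes can be sketched as follows. This is only an illustration under assumptions: the response `target` is a made-up smooth function standing in for finite element results, centres are evenly spaced, the width is tied to the centre spacing, and the weights come from lightly regularised normal equations rather than from the toolbox training used in the paper. Because the toy data are noise-free, the sketch mainly shows the mechanics of comparing training and held-out errors.

```python
import math


def solve(matrix, rhs):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(rhs)
    aug = [row[:] + [r] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - s) / aug[r][r]
    return x


def design_row(x, centres, width):
    """Responses of all neurons to a single scalar input."""
    return [math.exp(-((x - c) / width) ** 2) for c in centres]


def fit_weights(xs, ys, centres, width, ridge=1e-8):
    """Least-squares weights from the (slightly regularised) normal equations."""
    rows = [design_row(x, centres, width) for x in xs]
    k = len(centres)
    ata = [[sum(r[i] * r[j] for r in rows) + (ridge if i == j else 0.0)
            for j in range(k)] for i in range(k)]
    aty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    return solve(ata, aty)


def rms_error(xs, ys, centres, weights, width):
    """Root-mean-square difference between network output and reference values."""
    errs = [sum(w * p for w, p in zip(weights, design_row(x, centres, width))) - y
            for x, y in zip(xs, ys)]
    return math.sqrt(sum(e * e for e in errs) / len(errs))


def target(x):
    """Hypothetical smooth response standing in for finite element results."""
    return math.sin(2.0 * x) + 0.5 * x


samples = [3.0 * i / 20.0 for i in range(21)]
train_x, test_x = samples[0::2], samples[1::2]   # alternate points are held out
train_y = [target(x) for x in train_x]
test_y = [target(x) for x in test_x]

errors = {}
for k in (2, 4, 6, 8, 10):
    centres = [3.0 * i / (k - 1) for i in range(k)]
    width = 3.0 / (k - 1)                        # width tied to centre spacing
    w = fit_weights(train_x, train_y, centres, width)
    errors[k] = (rms_error(train_x, train_y, centres, w, width),
                 rms_error(test_x, test_y, centres, w, width))

best_k = min(errors, key=lambda k: errors[k][1])
for k, (e_train, e_test) in errors.items():
    print(k, round(e_train, 4), round(e_test, 4))
print("chosen number of neurons:", best_k)
```

The network size is chosen on the held-out error, not the training error, since only the former measures generalization.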

IV. THE PROPOSED SIZING PROCESS 

From the above, the process of sizing an electromagnetic 

device, in particular, an electrical machine, could be 

implemented using a neural network. This is based on the 

fact that the process of sizing is usually fairly limited, e.g. 


for a specific torque requirement and architecture of 

machine, determine the key diameter values and the air 

gap size. If several designs of a specific class of machine 

already exist, then a neural network can be trained on this 

data and the generalization capability will allow it to 

estimate the “size” of the new device. As stated above, 

the goal of sizing is not to produce a perfect solution to 

the design problem, rather it is to get within a reasonable 

range in the design space of a possible solution. Thus the 

system does not have to be highly accurate; an error of 10 

or 20 percent in the performance of the proposed design 

is probably acceptable since a conventional optimization 

system can take the design from that point to completion. 

The process of developing a neural network based 

sizing system is shown in Fig. 3. 

Figure 3. Sizing Network Development Process. 

Since the goal of the sizing process is to develop an 

approximate synthesized prototype, the database shown 

in Fig. 3 is used primarily to identify the major features

and parameter values. Hence, in fact, a database of 

existing designs is (a) probably too limited and will not 

cover the design space particularly well and (b) is 

unlikely to be structured to provide the information 

needed for sizing. Instead, a more controlled database can 

be constructed by using existing analysis programs. 

Using this approach, the database can be developed to 

provide effective coverage of the design space. In 

addition, certain parameters of the device, e.g. the 


number of poles, the maximum frequency, etc., can be 

fixed and thus the network can be trained on a subset of 

the machine design space. This lowers the dimensionality 

of the space and hence, simplifies the network and 

reduces the training time. It also simulates most existing 

sizing processes where certain key parameters are set in 

the specifications. In the event that these are not set, a 

higher level network can be developed to first make the 

choice of these key parameters before moving into the 

sizing process. 

V. A SIMPLE SIZING TEST 

Given the issues facing machine designers due to the

costs of permanent magnets, a possible design scenario 

for demonstrating the effectiveness of the neural network 

approach is the replacement of a permanent magnet rotor 

with that for an induction machine while keeping the 

stator design constant. Thus the goal is to design a rotor 

structure that can produce a specific torque-speed 

performance. Note that, since the native torque-speed 

curve for an induction machine is very different to that 

for a permanent magnet design, the substitution is only 

possible with the additional use of power electronics and 

an external control system. 

Conventional sizing approaches, which work well for 

permanent magnet machines, are not very effective in 

dealing with induction machine sizing and somewhat 

more sophisticated models based around equivalent 

circuits are needed. Thus the induction machine is an 

ideal candidate for the process being described in this 

paper. 

The proposed system was tested on two different rotor 

architectures. The first was a drag-cup servo rotor where 

the rotor architecture is a conducting (copper) cylinder 

around a permeable (iron) core. The design parameters 

here are simple: just the thickness and radius of the 

conducting cylinder. The second design involved a 

squirrel-cage rotor which increased both the 

electromagnetic complexity of the problem and the 

number of design parameters. 

A. The Drag-Cup Rotor 

Fig. 4 shows the basic design of the drag-cup rotor

being considered. 

Figure 4. A Drag-Cup Rotor for an Induction Machine.

The typical torque-speed curve for this device is shown in Fig. 5.

Figure 5. Typical Torque-Speed Curve for a Drag-Cup Rotor.

TABLE I Drag-Cup Rotor Simulations

Test#   Inner Radius (mm)   Outer Radius (mm)   Starting Torque (Nm)   Maximum Torque (Nm)
1       25                  27                  6.12                   17.93
2       25                  28                  6.74                   25.56
3       25                  29                  6.82                   32.84
4       25                  30                  6.65                   39.25

TABLE II Drag-Cup Rotor from Neural Net (Torque in Nm)

Test#   Desired Start Torque   Desired Max Torque   Start Torque   Max Torque   Average Error
1       6.12                   17.93                6.52           17.87        3.40%
2       6.74                   25.56                7.02           24.85        3.44%
3       6.82                   32.84                6.76           31.4         2.57%
4       6.65                   39.25                6.58           36.31        4.23%

Table I shows a typical set of parameters for the drag-cup

rotor. A large range of values over each parameter was

used to generate the training and testing sets for the

neural network, and the torque results were computed using a

finite element code (MagNet [8]). The network was

constructed and trained using the MATLAB Neural

Network toolbox. Once trained, the network predictions 

were tested on a set of 50 samples. Each sample was also 

evaluated using the finite element analysis and the results 

were compared. The average error over the whole set was 

4%. Table II shows some typical results. 

Following on from these results, the complexity was 

increased by considering a squirrel-cage rotor, i.e. a 

structure consisting of a set of conducting bars in slots on 

the rotor. 


B. The Squirrel-Cage Rotor. 

A range of squirrel-cage rotor designs were 

constructed to work with a 4 pole, 3 phase stator, shown 

in Fig. 6. The variables in the rotor were the number of 

bars, the size of the bars and the diameter of the rotor. 

Fig. 7 shows the architecture of the squirrel-cage. A 

number of combinations of these parameters were 

produced and the torque-speed curves generated, again 

using the MagNet software. Results were generated for a 

range of values of each parameter resulting in 144

models in the database. Table III shows the parameters

and the ranges used. The number of conduction bars was 

set to an integer corresponding to the most commonly 

used values for a 4 pole system. Each rotor geometry was 

simulated for a range of frequencies from 0 to 60 Hz and 

the starting and peak torques recorded, as well as the 

torque-speed curve. 

Table III Ranges of Parameters for Squirrel-Cage Rotor

Parameter                        Minimum   Maximum
Radius of Conduction Bars (mm)   0.5       2
Radius of Rotor (mm)             28        36

The number of conduction bars was set to one of 15, 20, 30, 35.

Figure 6. The 4 Pole, 3 Phase Stator Design used with 

the Sizing System. 

The network was developed following the process 

described in Fig. 3 and, once trained, was used to size a 

rotor for a particular specification. The neural network 

sizing estimates were then compared with an analysis of 

the designed rotor in MagNet and an average error of 

around 9% was generated over all the samples for a 

network with 20 neurons. Thus it is reasonable to state

that the proposed system provided a “sizing” estimate for 

the rotor design which was within the tolerance expected 

at this point in the design process. 

As a last test, the network architecture was varied

to determine the effect of the number of neurons on the

error in prediction. The resulting errors are shown in Fig.

8 as a function of the number of neurons. The data in Fig.

8 show the lack of approximation capability of the

network for low numbers of neurons and the inability to 

generalize for high numbers. The ideal number for this 

problem appeared to be around 20 neurons in the 

network. It is not clear what caused the slight increase in 

error for a 15 neuron network and this bears further 

investigation. 

Figure 7. Basic Conductor Layout for a Squirrel Cage 

Rotor. 

Figure 8. Error between the Finite Element and Neural 

Network Solutions against the Number of Neurons in the 

Network for Starting and Maximum Torques 


The paper has described an approach to developing an 

initial prototype of an electromagnetic device based on a 

limited number of specifications. This is conventionally 

known as “sizing”. The use of a neural network together 

with a pre-computed database of examples, developed 

from a finite element analysis of a range of devices 

covering the design space, has been shown to be effective 

in developing an initial solution. The process of training 

the network is similar to developing the response surface 


for the particular machine examples. The neural network 

acts as a form of surrogate but it is capable of providing a 

solution to the inverse problem unlike the more 

conventional usage of these techniques where the goal is 

to develop an effective forward model. The accuracy of 

the neural network is within the range of existing sizing 

approaches and can probably be improved with a better 

training database. 

REFERENCES 

[1] Dyck, D.N., Lowther, D.A., “Automated Design of Magnetic Devices by Optimizing Material Distribution,” IEEE Transactions on Magnetics, Vol. 32, 3, 1996, pp. 1188-1193.

[2] Ouyang, J., Lowther, D.A., “A Hybrid Design Model for Electromagnetic Devices,” IEEE Transactions on Magnetics, Vol. 45, 3, 2009, pp. 1442-1445.

[3] Hawe, G.I., Sykulski, J.K., “The Consideration of Surrogate Model Accuracy in Single-Objective Electromagnetic Design Optimization,” Proceedings of the 6th International Conference on Computational Electromagnetics, 2006, pp. 1-2.

[4] Wang, L., Lowther, D.A., “Reducing the Design Space of Standard Electromagnetic Devices using Bayesian Response Surfaces,” IEEE Transactions on Magnetics, Vol. 46, 2010, pp. 2884-2887.

[5] Hawe, G., Sykulski, J., “Considerations of Accuracy and Uncertainty with Kriging Surrogate Models in Single-Objective Electromagnetic Optimisation,” IET Proceedings on Science, Education and Technology, Vol. 1, 2007, pp. 37-47.

[6] Aleksander, I., Morton, H., “An Introduction to Neural Computing,” London, UK, International Thomson Computer Press, 1991.

[7] Benbouza, N., Louai, F.Z., Nait-Said, N., “Application of Meshless Petrov Galerkin (MLPG) Method in Electromagnetics using Radial Basis Functions,” Proceedings of the 4th IET Conference on Power Electronics, Machines and Drives, 2008, pp. 650-655.

[8] MagNet Users Manual, Infolytica Corporation, 2012.


Exploring and Exploiting Parallelism in the Finite 

Element Method on Multi-core Processors: an 

Overview 

Hussein Moghnieh and David A. Lowther 

Department of Electrical and Computer Engineering, McGill University Montreal, Quebec, H3A 2A7, Canada 

E-mail: hussein.moghnieh@mail.mcgill.ca 

Abstract—Exploring parallelism requires identifying parts of a method or a kernel that can run concurrently. Exploiting 

parallelism involves utilizing techniques aimed at devising an efficient parallel implementation on a given processor. 

Different stages of the Finite Element Method have been found to require different approaches to explore and exploit their 

parallelism. While data locality is essential to gain performance, many approaches to parallelism have been found to not 

exhibit data locality by nature. 

Index Terms—Finite Element Method, incomplete Cholesky preconditioner, matrix assembly, mesh generation, multi-core 

processor, sparse matrix-vector multiplication. 

I. INTRODUCTION

The introduction of the multi-core processor by IBM (i.e. the POWER4) in 2001, and later by Intel and AMD, has rekindled the interest in using parallel computing to accelerate computations in an electromagnetic (EM) field simulation software running on a desktop computer. Since then, a considerable amount of research effort has been invested in investigating the methods and kernels executed in field simulation software; these include mesh generation, matrix assembly, sparse matrix-vector multiplication (SMVM) and iterative solver preconditioning techniques such as incomplete LU factorization (ILU). Despite having achieved a degree of performance gain, several shortcomings have reduced the effectiveness of those techniques in achieving the ultimate performance goal of a field analysis software, which is the reduction of the overall time to design a device. These impediments include the problem size and structure as well as the architecture of the multi-core processor.

This paper intends to illustrate the degree of parallelism which might be expected in each of the design and analysis stages of a process based around the finite element method (FEM), in addition to discussing several issues and bottlenecks that arise while exploiting parallelism on a multi-core processor. In particular, it is intended to examine the gains due to parallelism on realistic electromagnetic design examples, i.e. a 2D brushless DC motor model and a 3D transformer model.

II. METHODOLOGY

An initial 2D mesh of a brushless DC (BDC) motor model has been refined multiple times, by setting an upper limit on the area of the elements in each refinement step, in order to create a range of typical meshes and mesh sizes. Subsequently, first order and second order nodal formulations, in addition to an edge formulation have been applied on each mesh and a matrix has been assembled in each case. The effect of applying different formulations and mesh sizes on the matrix size (i.e. degrees of freedom and number of non-zeros) and matrix structure (i.e. maximum number of non-zeros per row and the average number of non-zeros per row) has been examined. The resulting matrices are shown in TABLE I. The matrix naming convention used is an indicator of the problem, element mesh size and the type of finite element formulation applied. For instance, BDC-1-0.07, indicates that the matrix is generated from the BDC problem, and a first order (i.e. 1) nodal formulation has been applied on a mesh where the maximum size of any triangular element is 0.07mm, while BDC-0-1 denotes a matrix that was assembled by applying an edge element formulation on a mesh where the maximum size of any triangular element is 1mm.

Further, an initial 3D mesh of a transformer (ET) model has been refined multiple times and a first-order nodal finite element formulation has been applied on each of the refined meshes. The resulting matrices are denoted by ET and are shown in TABLE I.

TABLE I
MATRICES PROPERTIES

Matrix        DOF          NNZ            Ave. (Max) nnz/row   CSR size (MB)
BDC-1-0.5     38,084       259,188        7 (12)               3.2
BDC-1-0.07    1,194,044    8,334,798      7 (22)               100
BDC-1-0.04    3,152,216    22,000,128     7 (33)               264
BDC-2-3       48,031       407,733        9 (37)               5
BDC-2-0.07    4,787,651    40,664,669     9 (43)               484
BDC-2-0.04    12,660,592   107,560,044    9 (60)               1,280
BDC-0-1       55,772       278,168        5 (5)                3.4
BDC-0-0.07    3,492,389    17,931,763     5 (5)                219
BDC-0-0.04    9,492,389    47,437,511     5 (5)                579
ET-1-0.08     38,234       549,047        15 (36)              6.4
ET-1-0.04     409,531      5,999,230      15 (31)              70
ET-1-0.01     1,975,427    28,927,159     15 (39)              339

Subsequently, the parallel performance and bottlenecks encountered in an efficient implementation of important FEM kernels, particularly matrix assembly, sparse matrix-vector multiplication, and preconditioning techniques based on incomplete LU factorizations, are investigated.

III. PARALLEL MATRIX ASSEMBLY 

The process of matrix assembly is not considered to be time consuming. It is an O(n) process, where n is the number of mesh elements, since it consists of iterating once over all mesh elements. For each element, two operations are performed. The first is to approximate the solution of the field within the element, which results in a dense u × u matrix for each mesh element e, where u depends upon the formulation and the number of unknowns in an element. The second operation is to map each entry of the dense matrix to a global matrix A. The latter step constitutes a significant portion of the total assembly cost, mainly because the global matrix A is sparse. Inserting and updating entries in a sparse matrix is not trivial, even when its structure is known a priori; this is the case for compressed sparse row (CSR) storage.

In the case of matrix assembly in FEM, the maximum number of non-zeros in any row can be roughly estimated since it depends on the FEM formulation used. In such a case, the ELLPACK sparse storage scheme [1] is a more suitable choice than CSR. The ELLPACK format stores a sparse matrix in two dense data structures (ELL_values and ELL_column_ind), as shown in Figure 1. ELL_values stores the non-zero values of each row in condensed form and pads the remaining spaces with zeros. ELL_column_ind stores the column index of each corresponding non-zero in ELL_values and “-1” for the padded entries. The width W of both structures corresponds to the maximum number of non-zeros per row; rows with fewer than W non-zeros are padded to fill the remaining locations.

Figure 1: Synchronized ELLPACK sparse storage. An n×n sparse matrix is stored in the ELLPACK sparse format as n×w values (ELL_values) and n×w column indices (ELL_column_ind), together with a mutex object and a values-per-row counter for each row.
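The ELLPACK layout described above can be sketched in a few lines. This is a minimal Python illustration, not the authors' implementation: the function name `to_ellpack`, the row representation as lists of (column, value) pairs and the placeholder values of 1.0 are assumptions; only the sparsity pattern is taken from the 6×6 example of Figure 1.

```python
def to_ellpack(rows, pad_col=-1):
    """Pack rows given as lists of (column, value) pairs into ELLPACK arrays."""
    # W is the largest number of non-zeros found in any row.
    width = max(len(r) for r in rows)
    # Values are stored left-aligned and padded with zeros up to width W.
    ell_values = [[v for _, v in r] + [0.0] * (width - len(r)) for r in rows]
    # Column indices are padded with -1, as in the text.
    ell_column_ind = [[c for c, _ in r] + [pad_col] * (width - len(r)) for r in rows]
    return ell_values, ell_column_ind, width


# Sparsity pattern of the 6x6 example matrix; the values are placeholders.
pattern = [[0, 1], [1, 4], [0, 2, 5], [2, 3, 5], [4, 5], [0, 1, 2, 5]]
rows = [[(j, 1.0) for j in cols] for cols in pattern]
ell_values, ell_column_ind, W = to_ellpack(rows)
padded = sum(cols.count(-1) for cols in ell_column_ind)
print(W, padded)
```

For this pattern the width is set by the last row, and every shorter row carries padding, which is the storage overhead that the format trades for a fixed, pre-allocatable row length.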

The performance of parallel matrix assembly using atomic operations on multi-core processors has been investigated. Mutual exclusion (mutex) objects from the POSIX threads (Pthread) library were used to synchronize the access of multiple threads to a shared resource, which in our case is the matrix A. For this purpose, an array of mutex objects was created in which each object corresponds to a row of the global matrix, as shown in Figure 1. In order for a thread to add or modify entries in a row of the global matrix, it must acquire a lock on the mutex object corresponding to that row. After the thread finishes its modifications, it releases the lock to make it available to other threads. For example, a thread that is assembling an element of 3 unknowns (1, 2, 3) must aggregate a total of 3 × 3 = 9 entries in the global matrix. Each group of 3 entries is added to the same row of the global matrix; hence, a total of 3 locks are required on 3 different mutex objects. This is illustrated in Algorithm 1 (line 6).

Algorithm 1: Parallel matrix assembly using atomic operations. 
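The per-row locking scheme described above can be sketched as follows. This is an illustrative Python sketch rather than the authors' Pthread code: `threading.Lock` stands in for a Pthread mutex, each global row is a dictionary instead of an ELLPACK row, and the names `assemble`, `row_locks` and the two-triangle test mesh are assumptions. Python threads do not execute this loop concurrently in the way native threads would, so the sketch shows the synchronization pattern only, not the speed-up.

```python
import threading


def assemble(elements, n_rows, n_threads=4):
    """Assemble a global matrix from element matrices using one lock per row."""
    # One dictionary (column -> value) per global row.
    A = [dict() for _ in range(n_rows)]
    # One mutex per global row, mirroring the array of mutex objects in the text.
    row_locks = [threading.Lock() for _ in range(n_rows)]

    def worker(chunk):
        for dofs, local in chunk:
            for a, i in enumerate(dofs):
                # One lock acquisition per global row touched by this element.
                with row_locks[i]:
                    for b, j in enumerate(dofs):
                        A[i][j] = A[i].get(j, 0.0) + local[a][b]

    # Elements are dealt out to the threads in a round-robin fashion.
    chunks = [elements[t::n_threads] for t in range(n_threads)]
    threads = [threading.Thread(target=worker, args=(c,)) for c in chunks]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return A


# Two triangles sharing the unknowns 1 and 2; each local matrix is all ones.
ones = [[1.0] * 3 for _ in range(3)]
elements = [((0, 1, 2), ones), ((1, 2, 3), ones)]
A = assemble(elements, n_rows=4, n_threads=2)
print(A[1])
```

Each element with 3 unknowns acquires 3 locks, one per row, and writes 3 entries under each lock, which matches the count given in the text.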

Figure 2 shows the runtimes in seconds of the parallel assembly of 3 matrices using a first-order nodal finite element formulation on a quad-core Intel i7 processor. The sequential runtimes are small (a few seconds) even though these matrices are considered to be those of realistic average-size problems. The runtimes were reduced by more than 50% relative to 1-thread execution when the number of threads was 4. Notice the difference between the runtimes of sequential execution (no synchronization), shown as horizontal lines, and the runtimes of 1 thread. This difference highlights the cost of the repeated calls to the Pthread Application Programming Interface (API). Although it appears to be large here, the overhead of calling the Pthread API is not the main concern in multi-threaded applications. Instead, it is the wait time that can be incurred when a thread is waiting for a mutex object to be released by another thread. In matrix assembly, this occurs when threads are simultaneously processing mesh elements that share vertices and edges. In the case of FEM, the possibility of threads waiting to acquire a lock is small since the number of shared vertices or edges is low; it is related to the average number of non-zeros per row.

Figure 2: Parallel matrix assembly timings in seconds on an 

Intel quad-core i7-860 processor.

A. Parallel Matrix Assembly Synchronization and 

Cache Data Locality 

The matrix assembly process in the previous experiments completed in a very short time (only a few seconds); hence, it was not possible to accurately measure the total time spent on synchronization (i.e. calling the Pthread API and waiting to acquire a mutex lock). Instead, Intel’s VTune Amplifier [2] was used to count the number of execution cycles spent on synchronization. In the case of matrix assembly using 1 thread, this number constituted around 9% of the total cycles spent on matrix assembly (see Figure 3). It reflects only the time to call the Pthread API, since no cycles were wasted waiting to acquire a lock (no other threads were competing for it). When using 4 threads, more cycles were spent stalled during synchronization, and the percentage of time wasted increased to 24% (see Figure 4).

Figure 3: Execution cycles of matrix assembly using 1 thread (matrix assembly: 91%; Pthreads lock/unlock: 9%).

Figure 4: Execution cycles of matrix assembly using 4 threads (matrix assembly: 76%; Pthreads lock/unlock: 24%).

IV. SPARSE MATRIX-VECTOR MULTIPLICATION 

It is well established that matrix-vector multiplication (the product of a matrix A and a vector X) exhibits a low ratio of floating-point operations (FLOP) to memory accesses, regardless of whether A is dense or sparse [3, 4]. This low FLOP/BYTE ratio makes SMVM a memory-bandwidth-limited problem, requiring optimization techniques that use the memory hierarchy (main memory, caches and registers) efficiently.
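To make the low FLOP/BYTE ratio concrete, the following sketch estimates it for a matrix stored in CSR. The byte sizes (8-byte values, 4-byte indices) and the best-case assumption that every vector entry is moved between memory and the processor exactly once are assumptions made for illustration; the function name is hypothetical. The DOF and NNZ values are those of BDC-1-0.5 in TABLE I.

```python
def csr_flop_per_byte(nnz, n_rows, value_bytes=8, index_bytes=4):
    """Best-case FLOP/BYTE estimate for one CSR matrix-vector product."""
    # One multiplication and one addition per stored non-zero.
    flops = 2 * nnz
    # Values and column indices per non-zero, plus the row pointer array.
    matrix_bytes = nnz * (value_bytes + index_bytes) + (n_rows + 1) * index_bytes
    # Read the input vector once and write the result vector once.
    vector_bytes = 2 * n_rows * value_bytes
    return flops / (matrix_bytes + vector_bytes)


ratio = csr_flop_per_byte(nnz=259_188, n_rows=38_084)
print(round(ratio, 3))
```

Even under these favourable assumptions the kernel performs well under one floating-point operation per byte moved, which is why the memory system rather than the arithmetic units limits performance.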

The experiments conducted and presented in this 

section aim at analyzing both the effectiveness and the 

limitation of the commonly used SMVM optimization 

techniques when applied on the matrix set described in 

TABLE I. 

Instead of using the ELLPACK storage described above, which could incur a large number of padded zeros in matrices arising from a high-order finite element formulation, a variation of this storage known as the Hybrid (HYB) storage is used. In this storage scheme, some of the non-zeros are stored in a coordinate list (COO) format so as to minimize the number of padded zeros in the ELLPACK storage, as illustrated in Figure 5.

Figure 5: Hybrid (HYB) storage scheme. Some non-zeros are stored in a coordinate list format (COO) in order to reduce the total number of padded zeros. In the example shown, the first three non-zeros of each row are kept in ELL_values and ELL_column_ind, and the two remaining non-zeros (entries 45 and 55) are stored in COO_values, COO_row_ind and COO_col_ind.
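The split between the ELLPACK part and the COO part can be sketched as below. This is a minimal Python illustration under stated assumptions: the function name `to_hyb`, the fixed `ell_width` argument and the placeholder values of 1.0 are not from the paper, and the rule used here (keep the first `ell_width` entries of each row, spill the rest) is only one simple way to choose which non-zeros go to COO. The sparsity pattern follows the 6×6 example of Figure 5.

```python
def to_hyb(rows, ell_width, pad_col=-1):
    """Split rows of (column, value) pairs into an ELLPACK part and a COO part."""
    ell_values, ell_column_ind = [], []
    coo_values, coo_row_ind, coo_col_ind = [], [], []
    for i, r in enumerate(rows):
        head, tail = r[:ell_width], r[ell_width:]
        pad = ell_width - len(head)
        # The first ell_width entries of the row go to the ELLPACK arrays.
        ell_values.append([v for _, v in head] + [0.0] * pad)
        ell_column_ind.append([c for c, _ in head] + [pad_col] * pad)
        # Any remaining entries of the row spill into the coordinate list.
        for c, v in tail:
            coo_values.append(v)
            coo_row_ind.append(i)
            coo_col_ind.append(c)
    return (ell_values, ell_column_ind), (coo_values, coo_row_ind, coo_col_ind)


pattern = [[0, 1], [1, 4], [0, 2, 5], [2, 3, 5], [0, 1, 4, 5], [0, 1, 2, 5]]
rows = [[(j, 1.0) for j in cols] for cols in pattern]
(ell_values, ell_column_ind), (coo_values, coo_row_ind, coo_col_ind) = to_hyb(rows, ell_width=3)
padded = sum(cols.count(-1) for cols in ell_column_ind)
print(padded, coo_row_ind, coo_col_ind)
```

With a width of 3, only the two short rows are padded, whereas a pure ELLPACK layout of width 4 would pad every row that has fewer than four non-zeros.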

In order to evaluate the magnitude of the impact of 

accessing X on SMVM performance, the multiplication 

by X[column] was replaced by X[i] (Algorithm 2, line 7). 

Although this multiplication yielded an incorrect result, 

the aim was to show an upper bound on performance gain 

in cache blocking (i.e. no cache misses on X). 

Algorithm 2: Modified SMVM to eliminate the effect of cache misses on X.
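The experiment described above can be sketched for a CSR matrix as follows. This is a hedged Python illustration, not the paper's Algorithm 2: the function name `smvm_csr`, the `ignore_columns` flag and the 3×3 test matrix are assumptions. With the flag set, the kernel reads X[i] instead of X[column], which removes the irregular accesses to X and deliberately produces a wrong product.

```python
def smvm_csr(row_ptr, col_ind, values, x, ignore_columns=False):
    """CSR matrix-vector product; optionally read x[i] instead of x[col_ind[k]]."""
    y = [0.0] * (len(row_ptr) - 1)
    for i in range(len(y)):
        acc = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            # Regular SMVM gathers x[col_ind[k]]; the modified kernel reads x[i],
            # so the access to x becomes sequential but the result is incorrect.
            xk = x[i] if ignore_columns else x[col_ind[k]]
            acc += values[k] * xk
        y[i] = acc
    return y


# CSR form of [[2, 0, 1], [0, 3, 0], [4, 0, 5]].
row_ptr = [0, 2, 3, 5]
col_ind = [0, 2, 1, 0, 2]
values = [2.0, 1.0, 3.0, 4.0, 5.0]
x = [1.0, 2.0, 3.0]
y_exact = smvm_csr(row_ptr, col_ind, values, x)
y_timing_only = smvm_csr(row_ptr, col_ind, values, x, ignore_columns=True)
print(y_exact, y_timing_only)
```

The second result is only meaningful as a timing probe: both variants perform the same number of multiplications and additions, so any runtime difference on a large matrix is attributable to the access pattern on X.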

Figure 6: BDC-1: SMVM performance when using cache blocking on X.

Eliminating the cache misses on X increased the performance of SMVM significantly (as anticipated) when the matrix was unstructured (i.e. BDC-1), as shown in Figure 6 (Natural ordering). To further validate the results, the set of matrices in TABLE I (BDC-1) were ordered to reduce their bandwidth using the Reverse Cuthill-McKee (RCM) technique [5]. When the matrices were ordered using RCM, the performance of SMVM using cache blocking was close to the performance of SMVM without cache blocking (Figure 6), since cache misses were reduced due to the ordered access pattern on X.

A. Loop Setup Overhead 

One of the factors that has been argued to contribute to reducing the performance of SMVM is the low number of non-zeros per row [6]. For each row of the matrix A, the inner loop of the SMVM code, whether using the CSR storage (as shown in line 5 of Algorithm 2) or the HYB storage, iterates over the row's non-zeros and multiplies them by the corresponding entries in X. When only a few non-zeros are present, the inner-loop setup overhead dominates and cannot be amortized over the short calculation time of a few non-zeros. Since the set of FEM matrices used in this work falls within this category (i.e. a low number of non-zeros per row), a test examining the degradation of the SMVM performance due to the inner-loop setup overhead has been carried out by replacing the inner loop of SMVM with a set of instructions which explicitly multiply each non-zero of a row of A by its corresponding element in X; this technique is often referred to as “loop unrolling”. Loop unrolling has been made possible by the use of the ELLPACK (or Hybrid) sparse format, since the number of stored entries per row is fixed, and hence the number of times the inner loop executes its instruction is fixed. In such a case, the inner loop can be eliminated and replaced by explicitly writing the set of instructions that it would have executed.

Algorithm 3 illustrates a sparse matrix-vector multiplication using the ELLPACK storage. Assuming that the width of the ELLPACK storage is 7, the inner loop, which multiplies the non-zeros of a row by the corresponding locations in X, is replaced by seven instructions. The effect of this technique on the performance of SMVM when applied to the BDC-1 matrix test set is shown in Figure 7. It can be seen that while loop unrolling did increase SMVM performance, the gain was not as significant as that obtained from eliminating cache misses on X.

Algorithm 3: SMVM loop unrolling using NVIDIA's Hybrid 

sparse storage. 
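The unrolling idea can be sketched for an ELLPACK matrix of width 7 as follows. This is an illustrative Python sketch, not the paper's Algorithm 3: the function names and test data are assumptions, and padded slots here use column index 0 with a stored value of 0.0 (rather than -1) so that every index is valid. In an interpreted language the unrolled form is shown for its structure only; the performance effect discussed in the text applies to compiled code.

```python
def smvm_ell_loop(ell_values, ell_column_ind, x):
    """ELLPACK matrix-vector product with an explicit inner loop."""
    y = []
    for vals, cols in zip(ell_values, ell_column_ind):
        acc = 0.0
        for v, c in zip(vals, cols):
            acc += v * x[c]
        y.append(acc)
    return y


def smvm_ell_unrolled7(ell_values, ell_column_ind, x):
    """Same product with the inner loop written out for a fixed width of 7."""
    y = []
    for v, c in zip(ell_values, ell_column_ind):
        # Seven explicit multiply-add terms replace the inner loop.
        y.append(v[0] * x[c[0]] + v[1] * x[c[1]] + v[2] * x[c[2]]
                 + v[3] * x[c[3]] + v[4] * x[c[4]] + v[5] * x[c[5]]
                 + v[6] * x[c[6]])
    return y


ell_values = [[1.0, 2.0, 3.0, 0.0, 0.0, 0.0, 0.0], [1.0] * 7]
ell_column_ind = [[0, 1, 2, 0, 0, 0, 0], [0, 1, 2, 3, 4, 5, 6]]
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
y_loop = smvm_ell_loop(ell_values, ell_column_ind, x)
y_unrolled = smvm_ell_unrolled7(ell_values, ell_column_ind, x)
print(y_loop, y_unrolled)
```

The unrolled variant is only valid because every row has exactly seven stored slots; with CSR, where the row length varies, the same transformation is not directly available.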


Figure 7: BDC-1: Loop unrolling and cache blocking (single-precision floating-point operations).

B. SMVM memory bandwidth 

Figure 8 shows the memory bandwidth when executing SMVM using different optimization techniques. The sustainable memory bandwidth obtained from executing the STREAM benchmark [7] on an Intel i7-860 processor is also shown in the same figure. The widely used STREAM benchmark serves as an indicator of the realistic performance of the memory subsystem of a particular processor. In this benchmark, a set of kernels is applied to a dense data structure chosen to be larger than the available cache of the processor.

Figure 8: BDC-1: SMVM sustainable memory bandwidth 

(MB/s) on Intel i7-860 processor. 

In general, a naïve implementation of SMVM (i.e. no optimization) worked well below 50% of the STREAM benchmark sustained memory bandwidth, while an optimized SMVM (with cache blocking) attained 70% of it. These results highlight the effect of using sparse storage, which introduces additional memory fetches due to indirect addressing and also prevents efficient memory pre-fetching by the processor. A similar observation was made when running the same experiments on an older-generation four-core system, AMD’s dual-socket, dual-core Opteron 2214.

C. Parallel SMVM 

This section compares the sequential and parallel 

performance of SMVM kernels when applied to matrices 

obtained from the set described in TABLE I and a 

miscellaneous matrix test set obtained from “the 

University of Florida Sparse Matrix Collection” [8] 

shown in TABLE II. The latter set has been widely used in 

the past few years by researchers to evaluate the 

performance of SMVM algorithms. The results of our 

evaluation are shown in Figure 9. 

TABLE II
MISCELLANEOUS MATRIX TEST SET

Matrix     DOF        NNZ          Ave. (Max) nnz/row   CSR size (MB)
Protein    36,417     4,344,765    120 (204)            50
Sphere     83,334     6,010,480    73 (81)              69
Cant.      62,451     4,007,383    65 (78)              46
Tunnel     217,918    11,524,432   53 (180)             133
CFD        46,835     2,374,001    50 (145)             27
Ship.      140,874    7,813,404    26 (68)              42
Econ.      206,500    1,273,389    7 (74)               16
Epidem.    525,825    2,100,225    4 (4)                26
Circuit    170,998    958,936      6 (353)              12

The following observations were drawn from the results shown in Figure 9:

The sequential performance of SMVM kernels is significantly higher when the matrix fits in the available processor cache than when it does not (e.g. BDC-1-0.5 and ET-0-0.5).

Matrices that have a high number of non-zeros per row attained higher GFLOPS than matrices with short row lengths. This is not due to the overhead caused by the inner loop of SMVM (as demonstrated in section IV.A), but to the ratio of DOF to NNZ. In general, matrices arising in FEM have high ratios of DOF to NNZ, which explains their low performance relative to other matrices. This also explains why matrices obtained from 3D first-order finite element analysis attained higher GFLOPS than matrices obtained from 2D analysis.

The performance of parallel SMVM is affected by the distribution of non-zeros among rows. Matrices arising from FEM have a balanced distribution of the number of non-zeros per row, leading to better thread utilization and subsequently to higher GFLOPS.

Figure 9: Parallel SMVM using HYBRID storage (double-precision floating-point operations)

V. PRECONDITIONING TECHNIQUES: INCOMPLETE CHOLESKY AND INCOMPLETE CHOLESKY WITH FILL-INS

There are two techniques to solve a system of linear equations Ax = b, where A is the coefficient matrix and b is the right-hand side vector. The first is to use direct solver methods and the second is to use iterative methods. The direct solver methods rely on decomposing the coefficient matrix A into lower and upper triangular matrices L and U, where A = LU. This is a robust method. However, it is not useful for large systems, since the triangular matrices L and U lose their sparsity, as zero entries in the coefficient matrix A turn into non-zero entries in L and U. Those new entries are referred to as fill-ins.
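How fill-ins arise can be seen with a purely symbolic elimination on a small sparsity pattern. The sketch below is an illustration under stated assumptions: it works on a dense boolean pattern, ignores pivoting and numerical cancellation, and the function name `lu_fill_in` and the two "arrow" test patterns are not from the paper.

```python
def lu_fill_in(pattern):
    """Count fill-ins created by Gaussian elimination on a boolean pattern."""
    n = len(pattern)
    p = [row[:] for row in pattern]
    fill = 0
    for k in range(n):
        for i in range(k + 1, n):
            if not p[i][k]:
                continue
            for j in range(k + 1, n):
                # Eliminating entry (i, k) with row k creates (i, j) if (k, j) exists.
                if p[k][j] and not p[i][j]:
                    p[i][j] = True
                    fill += 1
    return fill


n = 5
# Dense first row and column plus the diagonal.
arrow_first = [[i == j or i == 0 or j == 0 for j in range(n)] for i in range(n)]
# The same matrix with the dense row and column moved to the end.
arrow_last = [[i == j or i == n - 1 or j == n - 1 for j in range(n)] for i in range(n)]
print(lu_fill_in(arrow_first), lu_fill_in(arrow_last))
```

The first pattern fills in completely while the second creates no fill at all, which also hints at why the ordering of the unknowns matters for the factors.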

A less robust technique is based on iterative approaches, such as the conjugate gradient (CG) method. This method requires a large number of iterations over the system of linear equations to reach the solution. The number of iterations depends upon the condition number of the matrix A in Ax = b. This condition number can be reduced (i.e. leading to fewer CG iterations) if a preconditioner that is based on the incomplete factorization of A is applied to the CG method [9].

Incomplete factorization derives its name from the direct method discussed above. It uses the same elimination algorithm to decompose the matrix A into incomplete factors, which approximate L and U and are obtained by dropping some fill-in entries. One of the dropping strategies during ILU factorization is to drop all fill-ins so that the sparsity of the incomplete factors matches that of the original matrix A. This dropping rule gives rise to an ILU(0) or IC(0) (incomplete Cholesky in the case of symmetric matrices) preconditioner [10], where the zero denotes that no fill-ins are allowed. Incomplete Cholesky with no fill-ins has been the preconditioner of choice on a desktop computer, mainly due to its ability to reduce the number of iterations of a preconditioned CG (PCG) solver while being inexpensive to produce and to apply. The structures of the incomplete factors are known a priori, making it easy to pre-allocate the storage without the need for symbolic factorization. An efficient implementation would be to duplicate the lower part of A and then perform an in-place factorization by going in an ordered manner over the entries of each row. A very efficient implementation is found in the SparseLib++ library [11]. TABLE III shows the execution times of creating an IC(0) preconditioner.
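The "copy the lower part of A and factorize in place" idea can be sketched as follows. This is a minimal Python illustration on dense storage, not the SparseLib++ routine: the function name `ic0` and the 4×4 test matrix (a Laplacian-like matrix on a 2×2 grid) are assumptions, and a practical implementation would operate on sparse rows instead.

```python
import math


def ic0(A):
    """Incomplete Cholesky with zero fill on a dense symmetric matrix."""
    n = len(A)
    # Copy the lower triangle of A; the factor is computed in place in this copy.
    L = [[A[i][j] if j <= i else 0.0 for j in range(n)] for i in range(n)]
    # The sparsity pattern is fixed up front and never grows.
    pattern = [[L[i][j] != 0.0 for j in range(n)] for i in range(n)]
    for k in range(n):
        L[k][k] = math.sqrt(L[k][k])
        for i in range(k + 1, n):
            if pattern[i][k]:
                L[i][k] /= L[k][k]
        for j in range(k + 1, n):
            if not pattern[j][k]:
                continue
            for i in range(j, n):
                # Only entries inside the original pattern are updated (no fill-in).
                if pattern[i][k] and pattern[i][j]:
                    L[i][j] -= L[i][k] * L[j][k]
    return L


A = [[4.0, -1.0, -1.0, 0.0],
     [-1.0, 4.0, 0.0, -1.0],
     [-1.0, 0.0, 4.0, -1.0],
     [0.0, -1.0, -1.0, 4.0]]
L = ic0(A)
print(L[2][1], round(L[3][3], 4))
```

In this example a complete factorization would create an entry at position (2, 1); IC(0) leaves it at zero, and the product of the factor with its transpose still reproduces A on the original sparsity pattern.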

TABLE III
INCOMPLETE CHOLESKY PERFORMANCE

Matrix        Degrees of freedom   Upper triangle NNZ   CSR size (MB)   IC(0) time (sec.)
BDC-1-0.5     38,084               147,636              1.9             0.0275
BDC-1-0.1     632,883              2,521,428            31.3            0.4987
BDC-1-0.04    3,152,216            12,576,171           155             2.594
ET-0.08       38,324               293,643              3.5             0.1359
ET-0.04       409,531              3,204,372            38.2            1.554
ET-0.01R      2,666,039            21,100,983           252             10.3244

In order to improve the convergence rate of PCG beyond that provided by the IC(0) preconditioner, much research has focused on extending the idea of the incomplete Cholesky preconditioner by allowing fill-ins to occur. Two heuristics are used to control the amount of fill-in. The first is based on a drop tolerance criterion, known as the Incomplete LU Threshold (ILUT), through which entries are dropped if their values are below a preset threshold. The second is based on the level of fill-in, known as ILU(k), where symbolic factorization, using graph theory, is carried out to identify the locations of the fill-ins and their level in the graph. The fill-in entries that exceed a given level k are dropped. Original matrix elements are assigned level 0; hence IC(0) discards all fill-ins and the resulting factorized matrix has the same sparsity pattern as the original matrix. One way to calculate the level of a fill-in is to use the sum rule shown in (1). This rule gives rise to a symbolic factorization algorithm described by Hysom [12] that is amenable to parallelization: the sparsity of each row in the final preconditioner can be evaluated independently of the other rows. Figure 10 demonstrates the scalability of this algorithm. Despite that, the runtime of the symbolic factorization is considered to be a bottleneck in our case, mainly due to the large number of fill-ins that occurred in the final ILU(k) preconditioners (where k = 1, 2 or 3), as shown in TABLE IV.

$$\mathrm{level}(i,j) = \min_{1 \le h < \min\{i,j\}} \left\{ \mathrm{level}(i,h) + \mathrm{level}(h,j) + 1 \right\} \qquad (1)$$
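The sum rule in (1) can be applied directly to a small sparsity pattern. The sketch below is a dense, sequential illustration and not Hysom's row-independent graph algorithm: the function name `fill_levels`, the `max_level` argument and the 4×4 test pattern are assumptions. Original entries start at level 0, candidate levels are computed with the sum rule, and entries whose level exceeds `max_level` are treated as dropped.

```python
INF = float("inf")


def fill_levels(pattern, max_level):
    """Return the boolean pattern kept by a level-of-fill factorization."""
    n = len(pattern)
    # Original entries have level 0; everything else starts as "not present".
    level = [[0 if pattern[i][j] else INF for j in range(n)] for i in range(n)]
    for h in range(n):
        for i in range(h + 1, n):
            if level[i][h] > max_level:
                continue  # a dropped entry cannot generate further fill
            for j in range(h + 1, n):
                if level[h][j] > max_level:
                    continue
                # Sum rule: level(i, j) = min(level(i, h) + level(h, j) + 1).
                candidate = level[i][h] + level[h][j] + 1
                if candidate < level[i][j]:
                    level[i][j] = candidate
    return [[lv <= max_level for lv in row] for row in level]


pattern = [[True, True, True, False],
           [True, True, False, True],
           [True, False, True, True],
           [False, True, True, True]]
kept0 = fill_levels(pattern, max_level=0)
kept1 = fill_levels(pattern, max_level=1)
print(sum(map(sum, kept0)), sum(map(sum, kept1)))
```

For this pattern, allowing level-1 fill adds the symmetric pair of entries at (1, 2) and (2, 1), both created when the first unknown is eliminated.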


Figure 10: Execution times of parallel symbolic factorization of BDC-1-0.1. The results demonstrate that the multithreaded implementation of Hysom’s algorithm is highly scalable.

TABLE IV
FILL-INS

Matrix       IC(0)        ILU(1)              ILU(2)               ILU(3)
BDC-1-0.5    148,636      225,244 (51%)       321,101 (116%)       429,566 (189%)
BDC-1-0.1    2,521,428    3,848,266 (52%)     5,544,448 (120%)     7,380,993 (193%)
ET-0.08      293,643      700,914 (139%)      1,368,930 (366%)     2,555,737 (770%)
ET-0.04      3,204,372    7,927,979 (147%)    15,963,746 (398%)    30,758,154 (860%)

VI. PRECONDITIONER BACKWARD-FORWARD 

SUBSTITUTION 

The next step is to investigate the degree of parallelism (i.e. the number of operations that can be executed simultaneously) that can be attained when solving a preconditioner (by backward and forward substitution) within a PCG iteration, for preconditioners obtained from the matrix BDC-1-0.5 (i.e. a 2D problem) and the matrix ET-0.08 (i.e. a 3D problem). A histogram is used to depict the maximum degree of parallelism and the number of steps required to solve each of the preconditioners. The x-axis shows the steps required to solve a matrix, and the y-axis shows the number of rows that can be solved simultaneously at a given step. Figure 11 and Figure 12 show the row-dependency histograms of ILU(1) and ILU(3) of the matrix BDC-1-0.5, respectively. The maximum degree of parallelism of ILU(1) was 1,151 and the number of steps required to solve the preconditioner was 196. On the other hand, the ILU(3) preconditioner of the same problem had a maximum degree of parallelism equal to 453, and 399 steps were required to solve it. The more fill-ins that existed in a preconditioner, the less parallelism could be exploited.
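The histogram described above follows from the row dependencies of the triangular factor: a row can be solved one step after the last row it depends on. The following Python sketch shows this scheduling for a forward substitution; the function name `solve_levels`, the input format (for each row, the list of earlier rows it depends on) and the two 4-row test patterns are assumptions for illustration.

```python
from collections import Counter


def solve_levels(lower_cols):
    """Assign each row of a lower triangular factor to a solve step."""
    step = [0] * len(lower_cols)
    for i, deps in enumerate(lower_cols):
        # Row i can be solved one step after the last row it depends on.
        step[i] = 1 + max((step[j] for j in deps), default=0)
    # histogram[s] is the number of rows that can be solved together at step s.
    histogram = Counter(step)
    return step, histogram


# Lower triangular pattern without fill-in: row 3 depends on rows 1 and 2.
no_fill = [[], [0], [0], [1, 2]]
# The same pattern with one fill-in entry at (2, 1).
with_fill = [[], [0], [0, 1], [1, 2]]
steps_a, hist_a = solve_levels(no_fill)
steps_b, hist_b = solve_levels(with_fill)
print(len(hist_a), max(hist_a.values()), len(hist_b), max(hist_b.values()))
```

Adding a single fill-in entry turns two rows that could be solved together into a chain, increasing the number of steps and lowering the maximum degree of parallelism, which is the trend reported for ILU(1) versus ILU(3).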

Solving a preconditioner obtained from a 3D problem is less amenable to parallelism than solving one obtained from a 2D problem. For instance, an ILU(1) preconditioner obtained from ET-0.08 (3D electric transformer problem) can be solved in 12,559 steps, where the maximum number of rows that can be solved simultaneously is only 69 (see TABLE V), and an ILU(3) preconditioner of the same problem can be solved in 24,125 steps, where the maximum attainable degree of parallelism is only 16 (see TABLE VI). A 2D problem with a comparable number of degrees of freedom (i.e. BDC-1-0.5) was more amenable to parallelism.

Figure 11: Degree of parallelism (y-axis) attained when solving ILU(1) of BDC-1-0.5. The x-axis represents the number of sequential steps to finish the solve stage.

Figure 12: Degree of parallelism (y-axis) when solving ILU(3) of BDC-1-0.5. The x-axis represents the number of sequential steps to finish the solve stage.

TABLE V
ILU(1) OF ET-0.08

Preconditioner level and ordering   NNZ        Max. degree of parallelism   Solving steps
ILU(1)-Natural                      700,914    69                           12,559
ILU(1)-AMD                          673,511    207                          569
ILU(1)-RCM                          585,646    36                           3,491

TABLE VI
ILU(3) OF ET-0.08

Preconditioner level and ordering   NNZ          Max. degree of parallelism   Solving steps
ILU(3)-Natural                      2,555,737    16                           24,125
ILU(3)-AMD                          1,937,345    119                          1,690
ILU(3)-RCM                          1,984,093    12                           10,548

Since the preconditioner obtained from the 3D transformer problem exhibited a low degree of parallelism, approximate minimum degree (AMD) [13] and Reverse Cuthill-McKee (RCM) orderings were first applied to the ET-0.08 matrix before generating ILU(1) and ILU(3) preconditioners. Although it has been established in the literature that orderings intended to reduce fill-ins or increase parallelism (RCM and AMD) degrade the quality of the preconditioner and lead to more PCG iterations than Natural ordering [14], [15], the aim of this experiment was to focus only on the implication of ordering in terms of reducing the overall solver time.

The number of non-zeros in the upper-triangle preconditioner, the maximum degree of parallelism and the number of solving steps of the ILU(1) and ILU(3) preconditioners under different orderings are summarized in TABLE V and TABLE VI, respectively. AMD ordering resulted in a considerably more parallelizable preconditioner solve than Natural and RCM orderings. On the other hand, RCM exhibited a degree of parallelism similar to that of the Natural ordering but required fewer solve steps. This is because RCM reduces the bandwidth of the matrix and balances the distribution of non-zeros between rows. This implies a balance in the degree of parallelism among steps, which translates into balanced thread utilization.


A. Matrix Structure 

Given the dependency of the sparse storage upon the problem structure, it is important to devise matrix test sets that are relevant to the problem domain (i.e. low-frequency electromagnetic analysis using the Finite Element Method). One of the advantages of matrices generated in FEM is the absence of a large discrepancy in the number of non-zeros between rows. This enables the use of a sparse storage technique such as ELLPACK, which pre-allocates memory to store the coefficient matrix.

B. 2D vs 3D Analysis 

Matrices generated from 2D finite element analysis are less dense than those generated from 3D problems. SMVM attained more GFLOPS in the case of a denser matrix (i.e. first-order finite element formulation of a 3D problem). However, an ILU preconditioner generated from a 3D problem was less amenable to parallelism than a preconditioner of a 2D problem.

C. Matrix Ordering 

Overall, although parallelism can be explored and 

exploited in most of the examined FEM kernels, the main 

bottleneck remains the solver part of the FEM process. 

There are many sub-kernels that are executed within a 

large number of loops. This places a stringent 

requirement on sparse data structures as there is no gain if 

these structures change in between sub-kernels. Further, 

in each sub-kernel within the solver, there is a large 

number of simple operations to be executed. The number 

of operations is related to the degrees of freedom of the 

matrix and the operations’ complexity is related to the 

number of non-zeros per row. These simple operations 

are memory bandwidth limited requiring that each 

operation be optimized in terms of memory access. 

Hence, single thread optimization remains the most 

essential part of the solver’s optimization. 

Reverse Cuthill-McKee (RCM) ordering has been 

found to be beneficial assuming that it will not degrade 

the performance of PCG. It balances thread utilization
when solving a preconditioner and also enhances cache
performance in SMVM.
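The bandwidth-reducing effect of RCM can be illustrated with a short sketch. The code below is a textbook-style breadth-first variant written for this example, not the routine used by the authors; the function names and the adjacency-dict input are assumptions.

```python
from collections import deque


def bandwidth(adj, order):
    """Largest index distance between two connected vertices under `order`."""
    pos = {v: k for k, v in enumerate(order)}
    return max((abs(pos[u] - pos[v]) for u in adj for v in adj[u]), default=0)


def reverse_cuthill_mckee(adj):
    """Return an RCM ordering of the graph given as {vertex: set(neighbours)}."""
    visited, order = set(), []
    # Start each connected component from a vertex of minimum degree.
    for start in sorted(adj, key=lambda v: len(adj[v])):
        if start in visited:
            continue
        visited.add(start)
        queue = deque([start])
        while queue:
            v = queue.popleft()
            order.append(v)
            # Visit neighbours in order of increasing degree.
            for w in sorted(adj[v], key=lambda u: len(adj[u])):
                if w not in visited:
                    visited.add(w)
                    queue.append(w)
    return order[::-1]  # reversing the Cuthill-McKee order gives RCM


if __name__ == "__main__":
    # A path graph 0-3-1-4-2 with scrambled labels (hypothetical example).
    adj = {0: {3}, 1: {3, 4}, 2: {4}, 3: {0, 1}, 4: {1, 2}}
    print(bandwidth(adj, [0, 1, 2, 3, 4]), bandwidth(adj, reverse_cuthill_mckee(adj)))
```

For this small graph the natural labelling has a bandwidth of 3, while the reordered one has a bandwidth of 1, which keeps the vector entries touched by each row close together in memory.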

REFERENCES 

[1] R. G. Grimes, D. M. Young, and D. R. Kincaid, "ITPACK 2.0:
User's Guide," CNA-150, Center for Numerical Analysis,
University of Texas, Austin, Texas, August 1979.

[2] Intel, "Intel VTune Amplifier XE 2011," 2011 ed: Intel
Corporation, 2011.

[3] S. Toledo, "Improving the memory-system performance of sparse-matrix
vector multiplication," IBM Journal of Research and
Development, vol. 41, pp. 711-725, 1997.

[4] R. Vuduc, "Automatic performance tuning of sparse matrix
kernels," PhD thesis, University of California, Berkeley, 2003.

[5] E. Cuthill and J. McKee, "Reducing the bandwidth of sparse 

symmetric matrices," presented at the Proceedings of the 1969 24th 

national conference, 1969. 

[6] S. Williams, L. Oliker, R. Vuduc, J. Shalf, K. Yelick, and J. 

Demmel, "Optimization of sparse matrix-vector multiplication on 

emerging multicore platforms," Parallel Computing, vol. 35, pp. 

178-194, 2009. 

[7] J. D. McCalpin, "STREAM: Sustainable memory bandwidth in 

high performance computers," STREAM: Sustainable Memory 

Bandwidth in High Performance Computers, 1991. 

[8] T. A. Davis and Y. Hu, "The University of Florida sparse matrix 

collection," ACM Transactions on Mathematical Software (TOMS), 

vol. 38, p. 1, 2011. 

[9] H. A. Van der Vorst, "Preconditioning by incomplete 

decompositions," Ph.D. Thesis, University of Utrecht, 1982. 

[10] J. A. Meijerink and H. A. van der Vorst, "An iterative solution method

for linear systems of which the coefficient matrix is a symmetric 

M-matrix," Mathematics of Computation, vol. 31, pp. 148-162, 

1977. 

[11] R. Pozo and K. Remington, "Sparselib++ v. 1. 5 sparse matrix class 

library. reference guide," NASA, 1996. 

[12] D. Hysom and A. Pothen, "Level-based incomplete LU 

factorization: Graph model and algorithms," Preprint UCRL-JC- 

150789, US Department of Energy, p. 17, 2002. 

[13] P. R. Amestoy, T. A. Davis, and I. S. Duff, "An approximate 

minimum degree ordering algorithm," SIAM Journal on Matrix 

Analysis and Applications, vol. 17, pp. 886-905, 1996. 

[14] I. S. Duff and G. A. Meurant, "The effect of ordering on 

preconditioned conjugate gradients," BIT Numerical Mathematics, 

vol. 29, pp. 635-657, 1989. 

[15] S. Doi and T. Washio, "Ordering strategies and related techniques 

to overcome the trade-off between parallelism and convergence in 

incomplete factorizations," Parallel Computing, vol. 25, pp. 1995- 

2014, 1999. 



Diagnosis of real cracks from the three spatial 

components of the eddy current testing signals 

M. Rebican∗, L. Janousek†, M. Smetana†, T. Strapacova†, A. Duca∗ and G. Preda∗

∗ University Politehnica of Bucharest, Spl. Independentei 313, Bucharest 060042, Romania

† Faculty of Electrical Engineering, University of Zilina, Univerzitna 1, 010 26 Zilina, Slovakia 

E-mail: mihai.rebican@upb.ro 

Abstract—This paper presents a novel approach for diagnosis of real cracks from two-dimensional eddy current testing 

signals by means of a stochastic method, such as tabu search. A new testing probe driving uniformly distributed eddy 

currents is employed for the inspection. Three spatial components of the perturbation field due to partially conductive 

cracks are sensed as the response signals in order to enhance information level of the inspection. The signals are simulated 

by a fast forward FEM-BEM solver using a database. Two crack models are proposed for the inversion: a crack with cuboid 

shape and a crack with more complex shape. In both cases, the cracks have uniform conductivity. The length, depth,

width and conductivity of the crack are unknown in the inversion process. Numerical results of the 3D reconstruction of 

partially conductive cracks from simulated 2D signals with added noise are presented and discussed. 

Index Terms—partially conductive cracks, diagnosis, eddy current testing, tabu search. 

I. INTRODUCTION


Eddy current testing (ECT) is one of the most common
electromagnetic methods employed in non-destructive
evaluation of conductive materials. The principle of
ECT lies in the interaction of induced eddy currents
with the structure of an examined conductive body,
based on the phenomenon of electromagnetic induction. The
method is widely applied in various fields, including
measurement of material thickness, proximity measurement,
corrosion evaluation and sorting of materials based on
their electromagnetic properties. However, its most
widespread application at present is the detection
and possible diagnosis of discontinuities.

Real cracks, such as stress corrosion cracks (SCC), 

usually appear in steam generator (SG) tubes of pressurized 

water reactor (PWR) of nuclear power plants. 

Recently, quite satisfactory results have been reported by several
groups for automated evaluation of artificial slits [1] and
even for several parallel notches [2] using eddy current
testing (ECT). However, evaluation of real cracks from
ECT signals remains very difficult.

In the case of artificial EDM notches, the width is
usually considered fixed in the inversion process of ECT
signals. However, for cracks with non-zero conductivity
the width affects the signal and has to be considered
unknown during reconstruction [3]. This means that an
additional variable has to be taken into account in the
evaluation of a detected SCC, which considerably increases
the ill-posedness of the inverse problem [4]. Thus, many
unsatisfactory results are reported when automated procedures
originally developed for non-conductive cracks
are employed in the evaluation of SCCs. It is stated
that one of the possible reasons is the lack of sufficient
information [1].

Several studies of the authors focused on enhancing 

information level of eddy current testing signals and on 

decreasing uncertainty in evaluation [5], [6]. Promising 

results create new challenges concerning development of 

automatic procedures for diagnosis of real cracks. 

In a previous work [2], the authors developed an
algorithm for the reconstruction of multiple artificial slits
from ECT signals by means of a stochastic optimization
method, namely tabu search. The reconstruction of multiple
cracks was three-dimensional. Therefore, the scheme is also
appropriate for the reconstruction of a partially conductive
crack, when the width is not considered fixed.

The paper proposes a novel approach for the three-dimensional
reconstruction of partially conductive cracks
from simulated two-dimensional ECT signals, consisting
of all three spatial components of the perturbation
field. Two crack models are proposed for the inversion.
The first one has a cuboid shape and the other reflects
a more complex geometry. Both crack models consider
a uniform distribution of the partial conductivity. The
length, depth, width and conductivity of the cracks are
unknown in the inversion process of the signals. The
validity of the approach is tested by using ECT signals
perturbed by added noise in the inversion process.

II. EDDY CURRENT TESTING PROBLEM DEFINITION 

A plate specimen having the electromagnetic parameters
of the stainless steel SUS316L is inspected in this
study. The specimen has a thickness of t = 10 mm, a
conductivity of σ = 1.35 MS/m and a relative permeability
of μr = 1.

Figure 1 shows the configuration of the plate (region
Ω0) with a single surface-breaking crack (shadow zone)
located inside the region Ω1. The crack region Ω1 (22 ×
2 × 10 mm³) is uniformly divided into a grid composed
of nx × ny × nz (11 × 5 × 10) cells defining a possible
crack geometry. The dimensions of each cell are 2.0 mm
in length, 0.4 mm in width, and 1.0 mm in depth.

A new eddy-current probe proposed by the authors
is employed for the near-side inspection of the plate
[7]. It consists of two circular exciting coils positioned
apart from each other and oriented normal to
the plate surface. The circular coils are connected in
series but magnetically opposite, so that they induce uniformly
distributed eddy currents in the plate. The exciting coils
are supplied from a harmonic source with a frequency
of 5 kHz and a current density of 1 A/mm². The detection
system of the probe is composed of three small circular
coils oriented along three mutually perpendicular
axes [5]. The detection system is located in the center
between the exciting coils to gain high sensitivity, as
the direct coupling between the exciting coils and the
detectors is minimal at this position.

Fig. 1. Configuration of the plate specimen with a crack. (Labels in the drawing: crack, probe, scanning direction, region Ω1, axes x and y, cell indices 1, 2, ..., nx, ny, nz.)

Figure 2 shows the configuration of the new probe.
Dimensions of the detecting coils are as follows: an inner
diameter of 1.2 mm, an outer diameter of 3.2 mm and a
winding height of 0.8 mm.

Two-dimensional scanning, the so-called C-scan, is performed
over the cracked surface with a lift-off of 1 mm.
The real and imaginary parts of the induced voltages in
all three detecting coils, corresponding to the three spatial
components of the perturbation electromagnetic field, are
sensed and recorded during the inspection.

III. PARTIALLY CONDUCTIVE CRACK MODELS 

Partially conductive cracks with a uniform conductivity 

smaller than the conductivity of the base material are 

considered in this paper. Two crack models are proposed 

for the partially conductive cracks. 

In the first model, shown in Figure 3, the crack has a
cuboid shape. The crack parameter vector c consists of
6 integers, c = [ix1, ix2, iy1, iy2, iz, s], where ix1 and
ix2 are the indices of the first and last cells of the crack
along the crack length, iy1 and iy2 are the indices of
the first and last cells of the crack along the crack
width, iz is the number of cells of the crack along the
crack depth, and s defines the crack conductivity as a
percentage of the base material conductivity, σc = s%σ
(σc - the conductivity of the crack, σ - the conductivity
of the base material).

Fig. 2. ECT probe configuration. (Labels in the drawing: exciting coils, detectors, plate, region Ω0, axes x and z; dimension labels 23, 35, 14 and 10.)

Fig. 3. Crack region division - cuboid shape of the crack (model 1).

In Figure 3, for a uniform grid with 13×5×10 cells, the 

parameter vector is c =[6, 13, 1, 3, 4, 20]. Thus, 8×3×4 

cells form the crack, and the crack conductivity is σc = 

20%σ. 
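The mapping from the parameter vector of model 1 to physical quantities can be sketched as follows. The cell dimensions (2.0 × 0.4 × 1.0 mm) and the base conductivity (1.35 MS/m) are taken from Section II; the function name `decode_cuboid` and the returned dictionary are assumptions made for this example, not part of the authors' code.

```python
CELL_MM = (2.0, 0.4, 1.0)   # cell length, width, depth in mm (Section II)
SIGMA_BASE = 1.35           # base material conductivity in MS/m (Section II)


def decode_cuboid(c):
    """Decode c = [ix1, ix2, iy1, iy2, iz, s] of crack model 1."""
    ix1, ix2, iy1, iy2, iz, s = c
    n_len = ix2 - ix1 + 1    # cells along the crack length
    n_wid = iy2 - iy1 + 1    # cells along the crack width
    return {
        "cells": (n_len, n_wid, iz),
        "length_mm": n_len * CELL_MM[0],
        "width_mm": n_wid * CELL_MM[1],
        "depth_mm": iz * CELL_MM[2],
        "sigma_crack": s / 100.0 * SIGMA_BASE,   # sigma_c = s% of sigma
    }


if __name__ == "__main__":
    # The example vector of Figure 3.
    print(decode_cuboid([6, 13, 1, 3, 4, 20]))
```

For the vector of Figure 3 this gives 8 × 3 × 4 cells, in agreement with the text, which under the stated cell size corresponds to a crack of 16 mm × 1.2 mm × 4 mm.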

The second crack model, shown in Figure 4, adopts
a more complex shape. The crack depth is considered
as variable along the crack length. The crack parameter
vector c consists of nx + 3 integers, c =
[iz1, iz2, ..., iznx, iy1, iy2, s], where izk, k = 1, ..., nx, is
the number of cells of the crack along the crack depth
at the k-th position along the crack length,
iy1 and iy2 are the indices of the first and last cells of
the crack along the crack width, and σc = s%σ (σc
- the conductivity of the crack, σ - the conductivity of the base
material).

In Figure 4, for a uniform grid with 13 × 5 × 10 

cells, the parameter vector contains 16 integers, as c = 

[0, 0, 0, 0, 0, 8, 4, 1, 2, 5, 3, 6, 4, 1, 3, 30]. Thus, (8+4+1+ 

2+5+3+6+4)× 3 cells form the crack, and the crack 

conductivity is σc = 30%σ. 
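The cell count quoted for model 2 can be checked with a few lines of code. This is only an illustrative sketch; the function name `decode_profile` is hypothetical, and the cell size is again the one given in Section II.

```python
CELL_MM = (2.0, 0.4, 1.0)   # cell length, width, depth in mm (Section II)


def decode_profile(c, nx):
    """Decode c = [iz1, ..., iznx, iy1, iy2, s] of crack model 2."""
    depths = c[:nx]                       # depth (in cells) at each length position
    iy1, iy2, s = c[nx:nx + 3]
    n_wid = iy2 - iy1 + 1
    open_cols = [d for d in depths if d > 0]
    return {
        "n_cells": sum(depths) * n_wid,
        "surface_length_mm": len(open_cols) * CELL_MM[0],
        "max_depth_mm": max(depths) * CELL_MM[2],
        "width_mm": n_wid * CELL_MM[1],
        "sigma_percent": s,
    }


if __name__ == "__main__":
    # The example vector of Figure 4 on a 13 x 5 x 10 grid.
    c = [0, 0, 0, 0, 0, 8, 4, 1, 2, 5, 3, 6, 4, 1, 3, 30]
    print(decode_profile(c, nx=13))
```

The depth entries sum to 33, so with a width of 3 cells the crack occupies 33 × 3 = 99 cells.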

In both models, the cracks have the same orientation.
The crack width can take the values 0.4, 0.8,
1.2, 1.6 and 2 mm.

Fig. 4. Crack region division - complex shape of the crack (model 2).

IV. DIAGNOSIS OF PARTIALLY CONDUCTIVE CRACKS 

The fast-forward FEM-BEM solver using a
database [8], [9] is adopted here for the simulation of the ECT signals.
More precisely, a version of the database algorithm
upgraded by the authors in previous works [2], [10]
for the computation of the ECT signals due to multiple
cracks is used in this paper. The database is designed for
a three-dimensional defect region, and not, as is usual, for a
two-dimensional one where the crack width is considered
fixed. Thus, the ECT response signals can also be simulated
for partially conductive cracks with variable width
using the same database generated in advance.

The authors have already developed an algorithm for
the reconstruction of multiple cracks from ECT signals
by means of a stochastic optimization method, namely
tabu search [2]. That reconstruction of multiple cracks,
validated by experimental data, was three-dimensional. Therefore,
the scheme is also appropriate for the reconstruction
of a partially conductive crack, when the width is not
considered constant. It is well known that the width
significantly affects the signal for cracks of non-zero
conductivity [3].

Tabu search is employed for the three-dimensional 

diagnosis of a partially conductive crack [2]. The error 

function ε to be minimized is defined as: 

\[
\varepsilon(c) = \sum_{j = X, Y, Z}
\frac{\sum_{i=1}^{N} \left| \Delta V_{ij}(c) - \Delta V_{ij}^{m} \right|^{2}}
     {\sum_{i=1}^{N} \left| \Delta V_{ij}^{m} \right|^{2}}, \qquad (1)
\]

where c is the parameter vector of the crack,
ΔV_ij(c) and ΔV^m_ij are the simulated (reconstructed) and
true (measured) induced pick-up voltages of the coils
(ECT signal) for each spatial component (X, Y and Z
according to the coordinate system shown in Figures 1
and 2) at the i-th scanning point, respectively, and N is
the number of scanning points.
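The error function (1), read as the sum over the three components of the squared misfit normalized by the energy of the measured signal, can be evaluated as in the following sketch. The data layout (a dict mapping each component to a list of complex voltages) and the function name are assumptions for illustration only.

```python
def error_function(simulated, measured):
    """Evaluate eq. (1): sum over X, Y, Z of the normalized squared misfit.

    Both arguments map a component name to a list of complex pick-up
    voltages, one per scanning point.
    """
    total = 0.0
    for comp in ("X", "Y", "Z"):
        num = sum(abs(s - m) ** 2 for s, m in zip(simulated[comp], measured[comp]))
        den = sum(abs(m) ** 2 for m in measured[comp])
        total += num / den
    return total


if __name__ == "__main__":
    # Hypothetical voltages at three scanning points per component.
    meas = {k: [1 + 1j, 2 - 0.5j, 0.3j] for k in ("X", "Y", "Z")}
    sim = {k: [1.1 * v for v in vals] for k, vals in meas.items()}
    print(error_function(meas, meas), error_function(sim, meas))
```

A simulated signal that is uniformly 10% too large contributes 0.01 per component, so the second value printed is close to 0.03, while a perfect match gives exactly zero.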

Figures 5-7 show the simulated ECT signal for each
spatial component (X, Y, Z) caused by a partially conductive
crack with a cuboid shape, which has the parameters
lc = 6 mm, wc = 0.8 mm, dc = 4 mm, σc = 5% of σ.

In this paper, noise is added to the simulated signals
before the inversion process in order to test
the validity and robustness of the proposed approach. The
perturbed signal is computed as:

\[
(\Delta V_{i}^{m})_{ns} = \Delta V_{i}^{m} \, (1 \pm ns\%), \qquad (2)
\]

where ΔV^m_i and (ΔV^m_i)_ns are the initial and perturbed
true signals at the i-th scanning point, respectively, and ns is
a random value bounded by an imposed maximum level, NOISE.

Figure 8 shows the perturbed ECT signal for Z component 

when noise of maximum level 40% is added to 

the simulated signal shown in Figure 7. 
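A possible way to generate such a perturbed signal is sketched below. The paper only states that ns is a random value limited by NOISE; the uniform distribution on [-NOISE, +NOISE], drawn independently at each scanning point, is an assumption of this example, as is the function name `perturb`.

```python
import random


def perturb(signal, noise_max_pct, rng):
    """Apply eq. (2): scale each sample by (1 + ns/100), |ns| <= noise_max_pct.

    ns is drawn uniformly and independently per scanning point
    (an assumption; the source does not specify the distribution).
    """
    out = []
    for v in signal:
        ns = rng.uniform(-noise_max_pct, noise_max_pct)
        out.append(v * (1.0 + ns / 100.0))
    return out


if __name__ == "__main__":
    rng = random.Random(0)                       # fixed seed for repeatability
    clean = [0.2 + 0.1j, 0.8 - 0.3j, 1.5 + 0.4j]  # hypothetical voltages
    print(perturb(clean, 40.0, rng))
```

Because the perturbation is multiplicative, the phase of each complex voltage is preserved and only its magnitude changes, by at most the imposed maximum level.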


Fig. 5. X component of the simulated ECT signal (absolute voltage [mV] over the scan plane x [mm], y [mm]).


Fig. 6. Y component of the simulated ECT signal (axes x [mm], y [mm]).

V. NUMERICAL RESULTS AND DISCUSSION 

The numerical simulations of the crack reconstruction
are performed using an ordinary PC: Intel Core 2 Quad
2.4 GHz, 3 GB RAM.

Table I presents the numerical results of the reconstruction
when a partially conductive crack is
modeled with a cuboid shape (crack model 1, Figure 3).
The column denoted "Real" gives the true dimensions
(lc × wc × dc) and partial conductivity (σc in % of σ)
of the crack. The results of the diagnosis are provided
in the columns labelled "Reconstructed" for various
maximum levels of the noise added to the simulated ECT
signals (2). The values of the error function ε(c) (1) of the diagnosis
are reported, too.

The time required for the reconstruction of one crack is
approximately 90-120 minutes. When no noise is
added to the signal (shown in Figures 5-7), the reconstructed
crack is identical to the real crack (the error function
ε = 0). In the case of diagnosis from signals perturbed
by noise (maximum level NOISE = 10, 20, 30, 40%),
the crack dimensions are exactly reconstructed even if the maximum
level of noise is high (30, 40%); only the partial conductivity
deviates slightly at these levels (4% instead of 5%, Table I).
Starting from another
initial population of tabu search, and for a signal
perturbed by added noise of the highest level, NOISE = 40%
(the Z component is shown in Figure 8), the result is
slightly different (the last column in Table I): the length
and depth are equal to the true values; the width and
partial conductivity are not estimated exactly, but with
good precision. However, the last two parameters are not
very important from the structural integrity point of view.
The results clearly show that the crack parameters are
estimated quite precisely from the noisy ECT signals
using the proposed approach.

Fig. 7. Z component of the simulated ECT signal (axes x [mm], y [mm]).

Fig. 8. Z component of the ECT signal perturbed by added noise of
maximum level 40% (axes x [mm], y [mm]).

TABLE I
RESULTS OF THE PARTIALLY CONDUCTIVE CRACK
RECONSTRUCTION FOR THE CRACK MODEL 1

Crack        Real   Reconstructed
NOISE (%)    -      0     10    20    30    40    40
lc [mm]      6.0    6.0   6.0   6.0   6.0   6.0   6.0
wc [mm]      0.8    0.8   0.8   0.8   0.8   0.8   1.2
dc [mm]      4.0    4.0   4.0   4.0   4.0   4.0   4.0
σc [%]       5      5     5     5     4     4     9
ε · 10^-3    -      0     11    21    32    42    45

Figures 9 and 10 show the results of three-dimensional 

diagnosis of the partially conductive crack described in 

Table I (column ”Real”) from 2D ECT signals without 

and with added noise of maximum level 20%, respectively, 

when the complex crack model (Figure 4) is 

employed for the inversion. The inversion procedure 

takes around 5-7 hours. 

In the case of the reconstruction from the signal without
noise, the crack is precisely localized and its length
and width are exactly estimated. The depth profile does
not perfectly copy the true one. However, the maximum
depth is accurately assessed. For the reconstruction
from the signal with added noise of maximum level 20%,
however, the reconstructed width is 0.4 mm smaller than
the real width (0.4 mm being the minimum width step).

Fig. 9. Reconstruction of a conductive crack from signal without noise (width-depth view; real: σc = 5%σ; reconstructed: σc = 3%σ, ε = 0.004).

Fig. 10. Reconstruction of a conductive crack from perturbed signal
by added noise of maximum level 20% (width-depth view; real: σc = 5%σ; reconstructed: σc = 4%σ, ε = 0.027).

Fig. 11. Reconstruction of an elliptical conductive crack (width-depth view; real: σc = 8%σ; reconstructed: σc = 6%σ, ε = 0.024).

A crack with an elliptical profile is also reconstructed
from a signal without noise. The crack opening has a value
of wc = 0.4 mm, its surface length is lc = 14 mm, the
maximum depth is dc = 4 mm and the crack partial
conductivity is set to σc = 8% of the base material
conductivity σ. The reconstruction result is shown
in Figure 11. The crack width and its surface length
are accurately assessed. The estimated crack position is
minimally shifted (0.4 mm) in the crack width direction
compared with the true position. The maximum depth is
slightly overestimated, by 1 mm. When the signal caused
by the elliptical conductive crack is perturbed by added
noise of maximum level 20% and then used in the
reconstruction, the maximum depth is overestimated by
2 mm, but the crack width and its surface length are
still precisely estimated.

The presented results demonstrate the effectiveness of the proposed
novel approach to the three-dimensional diagnosis of
partially conductive cracks, even if cracks with complex
shape and signals with added noise are considered. ECT
response signals gained during a C-scan, together with the
acquisition of all three spatial components of the perturbation
electromagnetic field, significantly improve the precision
of the inversion process using the tabu search stochastic
method.

VI. CONCLUSION


A novel approach for three-dimensional diagnosis of 

partially conductive cracks has been proposed in the 

paper. A special eddy current probe driving uniformly 

distributed eddy currents was used for the inspection of 

a plate specimen. A detection system of the probe was 

designed in such a way that all three spatial components 

of the perturbation electromagnetic field were acquired. 


The tabu search stochastic method was employed for the 

reconstruction of partially conductive cracks profile from 

eddy current response signals gained during the C-scan 

of the probe. The signals were perturbed by added noise. 

Two crack models were proposed: a crack with cuboid 

shape and the other one with more complex shape. 

The length, depth, width and conductivity of the crack 

were considered unknown in the inversion process. The 

conductivity of the crack was uniform. 

The presented results show that the proposed approach
allows the three-dimensional profile of a crack, together
with its partial conductivity, to be reconstructed quite
precisely from signals with added noise.

Further work of the authors will concern more realistic 

shapes of cracks and validation with measured data from 

natural cracks (SCC). 

ACKNOWLEDGMENT


This work has been co-funded by the Sectoral Operational 

Programme Human Resources Development 

2007-2013 of the Romanian Ministry of Labour, Family 

and Social Protection through the Financial Agreement 

POSDRU/89/1.5/S/62557. 

This work was supported by the Slovak Research and 

Development Agency under the contracts No. APVV- 

0349-10 and APVV-0194-07, and by grants of the Slovak 

Grant Agency VEGA, projects No. 1/0765/11, 1/0927/11. 

REFERENCES 

[1] N. Yusa, “Development of computational inversion techniques to 

size cracks from eddy current signals,” Nondestructive testing and 

evaluation, vol. 24, pp. 39–52, 2009. 

[2] M. Rebican, Z. Chen, N. Yusa, L. Janousek, and K. Miya, “Shape 

reconstruction of multiple cracks from ECT signals by means of 

a stochastic method,” IEEE Transactions on Magnetics, vol. 42, 

pp. 1079–1082, 2006. 

[3] Z. Chen, M. Rebican, N. Yusa, and K. Miya, “Fast simulation of 

ECT signal due to a conductive crack of arbitrary width,” IEEE 

Transactions on Magnetics, vol. 42, pp. 683–686, 2006. 

[4] N. Yusa, H. Huang, and K. Miya, “Numerical evaluation of the ill-posedness
of eddy current problems to size real cracks,” NDT&E
International, vol. 40, pp. 185–191, 2007.

[5] L. Janousek, M. Smetana, and K. Capova, “Enhancing information 

level in eddy-current non-destructive inspection,” International 

Journal of Applied Electromagnetics and Mechanics, vol. 33, pp. 

1149–1155, 2010. 

[6] L. Janousek, M. Smetana, and M. Alman, “Decreasing uncertainty 

in size estimation of stress corrosion cracking from eddy-current 

signals,” Studies in Applied Electromagnetics and Mechanics, 

vol. 35, pp. 53–60, 2011. 

[7] L. Janousek, M. Smetana, and M. Alman, “Decline in ambiguity 

of partially conductive cracks depth evaluation from eddy current 

testing signals,” International Journal of Applied Electromagnetics 

and Mechanics, vol. 34, 2012 (in press). 

[8] Z. Chen, K. Miya, and M. Kurokawa, “Rapid prediction of eddy 

current testing signals using A−φ method and database,” NDT&E 

International, vol. 32, pp. 29–36, 1999. 

[9] Z. Chen, K. Aoto, and K. Miya, “Reconstruction of cracks with 

physical closure from signals of eddy current testing,” IEEE 

Transactions on Magnetics, vol. 36, pp. 1018–1022, 2000. 

[10] M. Rebican, N. Yusa, Z. Chen, K. Miya, T. Uchimoto, and
T. Takagi, “Reconstruction of multiple cracks in an ECT round-robin
test,” International Journal of Applied Electromagnetics and
Mechanics, vol. 19, no. 1-4, pp. 399–404, 2004.


An Adaptive Galaxy-Based Search Approach for 

Electromagnetic Optimization Problems 

* Θ Leandro dos Santos Coelho, Θ Teodoro Cardoso Bora and † Piergiorgio Alotto 

* Industrial and Systems Eng. Graduate Program, Pontifical Catholic University of Parana, Curitiba, PR, Brazil 

Θ Department of Electrical Engineering (PPGEE), Federal University of Parana (UFPR), Curitiba, PR, Brazil 

† Dip. Ingegneria Industriale, Università di Padova, Italy, E-mail: piergiorgio.alotto@dii.unipd.it 

Abstract—Optimization metaheuristics have become very popular methods for electromagnetic device design. The Galaxy-based
search algorithm (GBSA) is a recently proposed algorithm, inspired by the movement of the arms of spiral galaxies in

outer space. In this work, a standard and an adaptive version of GBSA (AGBSA) based on historic knowledge are applied to 

an analytical testcase and to Loney’s solenoid benchmark problem, showing the suitability of this technique for 

electromagnetic optimization. Furthermore, both algorithmic variants are compared with other well-known stochastic 

optimizers. 

Index Terms— Electromagnetic optimization, Galaxy-based search algorithm, Loney’s solenoid. 

I. INTRODUCTION


Optimization algorithms which include stochastic 

components are nowadays commonly classified as 

metaheuristics and many of them, e.g. Particle Swarm 

Optimization (PSO), Genetic Algorithms (GA) and 

Evolution Strategies (ES), Differential Evolution (DE), 

just to name a few well-known ones, are known to be 

powerful techniques for the solution of optimization 

problems related to the design of electromagnetic 

devices. Such methods have been studied extensively in 

the last decades with growing interest in recent years (see 

e.g. [1]-[4]). 

A recently introduced metaheuristic which has not yet 

received much attention in the electromagnetic 

optimization community and which is starting to show 

interesting performance in other application areas is
the Galaxy-based search algorithm (GBSA) [6], [7].

GBSA is a nature-inspired optimization method which 

mimics the movement of the arms of spiral galaxies in 

outer space. 

The objective of this paper is to review the basic 

algorithmic features of the relatively uncommon GBSA 

optimizer and to present a modified and improved 

adaptive GBSA (AGBSA) variant. Both algorithms are 

then tested on Loney’s solenoid benchmark problem [5], 

which features a rough objective function surface typical 

of many electromagnetic problems in which the direct 

problem is solved by numerical methods. 

The rest of this paper is organized as follows. Section
II provides a detailed description of the GBSA algorithm,
while Section III is devoted to the application of GBSA to
a multiminima analytical test problem. In Section IV, we
describe Loney’s solenoid benchmark problem and
present the optimization results for the GBSA and
AGBSA algorithmic variants and comparisons with other
metaheuristics. Finally, the paper concludes with a brief
discussion in Section V.

II. FUNDAMENTALS OF THE GBSA ALGORITHM 

GBSA searches the input space using a spiral chaotic
movement approximating the behavior of one arm of a
spiral galaxy. This movement is driven by a chaotic
process using a logistic map [7]. The main steps of
GBSA are given in Fig. 1, where S represents the current
solution. The algorithm consists of two main
components which are repeated in sequence:
SpiralChaoticMove, shown in Fig. 2, and LocalSearch,
shown in Fig. 3. The SpiralChaoticMove has the role of
searching around the current solution denoted by S. When
the SpiralChaoticMove procedure finds an improved
solution, it updates S with the improved solution, and the
variable Flag is set to true. When Flag is true, the
LocalSearch component of GBSA is activated in order to
locally search around the current optimal solution.

  S ← GenerateInitialSolution
  S ← LocalSearch (S)
  While (termination condition is not met) do
    Flag ← False
    SpiralChaoticMove (S, Flag)
    If Flag then
      S ← LocalSearch (S)
    Endif
  Endwhile

Fig. 1. Pseudo code of classical GBSA.

The SpiralChaoticMove component is iterated for 

MaxRep times. However, whenever it finds a solution 

better than the current optimal solution, 

SpiralChaoticMove is terminated and the control of the 

algorithm is transferred to the main procedure of GBSA. 

The SpiralChaoticMove component searches the space
around the current best solution using a spiral movement
enhanced by a chaotic variable generated by the logistic
map:

\[
x_{n+1} = \lambda \, x_{n} \, (1 - x_{n}), \qquad (1)
\]

where λ = 4 and x_0 = 0.19 (a sample output for the
case of two degrees of freedom is given in Fig. 4). It
should be mentioned that the first 5000 iterations of the
logistic map are discarded in order not to include in the
generating sequence the transient motion leading to the
chaotic attractor.
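The chaotic sequence described above can be generated with a few lines of code. The sketch below uses the standard logistic map with the parameters quoted in the text (λ = 4, x_0 = 0.19, 5000 discarded iterations); the generator name `logistic_map` is an assumption for this example and plays the role of NextChaos() in the pseudo-code.

```python
def logistic_map(x0=0.19, lam=4.0, burn_in=5000):
    """Yield successive values of x_{n+1} = lam * x_n * (1 - x_n).

    The first `burn_in` iterates are discarded so that the transient
    towards the chaotic attractor is not part of the output.
    """
    x = x0
    for _ in range(burn_in):
        x = lam * x * (1.0 - x)
    while True:
        x = lam * x * (1.0 - x)
        yield x


if __name__ == "__main__":
    chaos = logistic_map()
    print([round(next(chaos), 4) for _ in range(5)])
```

Pairs of consecutive values can be used as coordinates for a two-dimensional problem, which is one way to obtain a scatter of the kind referred to as Fig. 4.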

The LocalSearch component of GBSA may either 

find a locally optimal solution or it will exceed the 

maximum number of iterations kMax without 

improvement. 

The SpiralChaoticMove component of GBSA is the
mechanism which is used for exploring the search space
in order to find the promising area which may include the
optimal solution. In contrast, the LocalSearch is the
GBSA component which is used to explore the promising
area to find within this area the minimum of the objective
function. In summary, exploration is conducted by
SpiralChaoticMove while exploitation is carried out by
LocalSearch.

  input:
    S, the current best solution (Si is the ith component, i = 1:N)
  output:
    SNext, the first found solution better than S
    Flag, set to true to indicate that a better solution has been found
  parameters:
    Each θi is initialised by (–1 + 2 NextChaos())
    Δθ = 0.01
    r = 0.001
    Δr is set by NextChaos() at each procedure call
    MaxRep is the maximum number of local iterations in
    SpiralChaoticMove (e.g. 100)

  θ = –π
  While rep < MaxRep
    For i = 1 to N
      SNexti ← Si + NextChaos() r cos(θi)
    End
    If (f(SNext) ≥ f(S)) then
      Flag ← true
      Return
    Endif
    For i = 1 to N
      SNexti ← Si - NextChaos() r cos(θi)
    End
    If (f(SNext) ≥ f(S)) then
      Flag ← true
      Return
    Endif
    r ← r + Δr
    For i = 1 to N
      θi ← θi + Δθ
    End
    For i = 1 to N
      If (θi > π) then
        θi ← –π
      Endif
    End
    rep ← rep + 1
  Endwhile

Fig. 2. Pseudo-code of the SpiralChaoticMove component of
GBSA

Both exploration and exploitation mechanisms are 

necessary for the success of any metaheuristic: without 

the exploitation mechanism, the metaheuristic may not be 

able to obtain accurate solutions whereas without the 

exploration mechanism, the metaheuristic may get easily 

trapped into a local optimum. 
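The exploration step can be sketched in code as follows. This is a simplified reading of the SpiralChaoticMove pseudo-code, not the authors' implementation: the helper names are hypothetical, a strict improvement test is used, and "better" is taken to mean a larger value of f, following the comparison direction in the pseudo-code (for a minimization problem one would pass the negated objective).

```python
import math


def make_next_chaos(x0=0.19, lam=4.0, burn_in=5000):
    """Return a NextChaos()-like function based on the logistic map."""
    state = [x0]

    def step():
        state[0] = lam * state[0] * (1.0 - state[0])
        return state[0]

    for _ in range(burn_in):   # discard the transient
        step()
    return step


def spiral_chaotic_move(f, s, next_chaos, max_rep=100, d_theta=0.01, r=0.001):
    """Search around s on a growing chaotic spiral; return (solution, flag)."""
    n = len(s)
    theta = [-1.0 + 2.0 * next_chaos() for _ in range(n)]
    d_r = next_chaos()                       # radius increment for this call
    for _ in range(max_rep):
        for sign in (+1.0, -1.0):            # try both sides of the current point
            cand = [s[i] + sign * next_chaos() * r * math.cos(theta[i])
                    for i in range(n)]
            if f(cand) > f(s):               # first improvement ends the move
                return cand, True
        r += d_r
        theta = [t + d_theta if t + d_theta <= math.pi else -math.pi
                 for t in theta]
    return list(s), False                    # no improvement: keep s


if __name__ == "__main__":
    # Maximize -(x^2 + y^2), i.e. minimize the squared distance to the origin.
    f = lambda p: -(p[0] ** 2 + p[1] ** 2)
    print(spiral_chaotic_move(f, [1.0, 1.0], make_next_chaos()))
```

Returning at the first improvement mirrors the early exit in the pseudo-code, after which the main loop would hand the new point to LocalSearch for exploitation.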

The advantage of using the LocalSearch component
with respect to some other exploitation mechanisms, such
as the mutation typical of Genetic Algorithms, is that the
proposed local search never allows the algorithm to lose the
current best solution, thus increasing the greediness of
the algorithm.

  input:
    S, the current best solution (Si is the ith component, i = 1:N)
  output:
    SNext, a solution better than S
  parameters:
    ΔS is the step size
    α is a dynamic parameter
    kMax is the maximum number of local iterations in
    LocalSearch (e.g. 100)

  α ← 1
  k ← 0
  While k < kMax
    SLi ← Si – α·ΔS·NextChaos()
    SUi ← Si + α·ΔS·NextChaos()
    If f(SL) < f(S) and f(SU) < f(S) then
      Goto Endrepeat
    Endif
    If f(SU) > f(S) then
      Si ← SUi
      SLi ← SUi
      α ← α + 0.01 × NextChaos()
      k ← k + 1
    ElseIf f(SL) > f(S) then
      Si ← SLi
      SUi ← SLi
      α ← α + 0.01 × NextChaos()
      k ← k + 1
    Else
      α ← α + 0.05 × NextChaos()
      k ← k + 1
    Endif
  Endwhile
  SLi ← Si
  SRi ← Si
  Endrepeat
  SNext ← S

Fig. 3. Pseudo-code of the LocalSearch component of GBSA

Fig. 4. Sample output of the logistic map for the two dimensional
case (both axes range from 0.2 to 1).

Moreover, the SpiralChaoticMove also never loses the
best solution found so far. In addition, it uses chaotic
movements in order to reduce the chance of getting
trapped in local optima (although this may still happen,
as will be seen in the section devoted to the analytical
test case). Due to the chaotic process, GBSA does not
return to the same solution and thus the diversity of the found
solutions is kept high. Keeping diversity high is
especially important in dealing with
multimodal problems.

The tuning of the step size ΔS in the LocalSearch
procedure is a very delicate task in the classical GBSA
and the convergence properties of the method strongly
depend on its specific value, which is also problem-dependent.
The proposed AGBSA tunes the
step size using history knowledge of mean distances for
the best solution in the previous iteration. The use of
history knowledge is typical of cultural algorithms [9].

III. ANALYTICAL BENCHMARK 

The analytical benchmark refers to the minimization of the so-called six-hump camel back function

$$f(x_1, x_2) = \left(4 - 2.1\,x_1^2 + \frac{x_1^4}{3}\right) x_1^2 + x_1 x_2 + \left(-4 + 4\,x_2^2\right) x_2^2 .$$

The function has features which are typical of many real problems, i.e. a bowl-shaped large-scale behavior, shown in Fig. 5a, which incorporates a relatively flat plateau with a rather rough small-scale behavior with several local minima, shown in Fig. 5b.

The function has two global minima, at [-0.089842, 0.712656] and [0.089842, -0.712656], with value f = -1.031628453, and four additional local minima.
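As a quick check of the benchmark definition, the following minimal Python sketch evaluates the standard six-hump camel back function at the two global minima quoted above. The function name `six_hump_camel` is chosen here for illustration only.

```python
def six_hump_camel(x1, x2):
    """Standard six-hump camel back test function."""
    return ((4.0 - 2.1 * x1**2 + x1**4 / 3.0) * x1**2
            + x1 * x2
            + (-4.0 + 4.0 * x2**2) * x2**2)


if __name__ == "__main__":
    # The two global minima are mirror images of each other.
    for point in [(-0.089842, 0.712656), (0.089842, -0.712656)]:
        print(point, six_hump_camel(*point))
```

The function is symmetric under (x1, x2) → (-x1, -x2), which is why the two global minima share the same value.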

Fig. 5: a) large-scale and b) small-scale behavior of the six-hump camel back function (surface plots of f(x1,x2) over x1 and x2)


Fig. 6 shows the positions of the optimal solutions obtained over 30 runs of the algorithm with the maximum number of objective function evaluations set to 100 (Fig. 6a) and 500 (Fig. 6b).

The picture clearly shows that even with a rather small number of function evaluations the areas of the global minima are usually correctly identified by the algorithm, while a larger number of function evaluations yields good precision. The ability of the algorithm to escape local minima is also clearly shown: none of the runs terminated in one of the four local minima at the higher number of function evaluations, while some trapping in local minima can be seen at the lower number of function evaluations.

Fig. 6: Optimal solution over 30 runs: a) max. 500 function evaluations, b) max. 3000 function evaluations (horizontal axis from −1.5 to 1.5, vertical axis from −1 to 1)

IV. LONEY’S SOLENOID DESIGN 

The objective in Loney’s solenoid benchmark problem 

[5] is to produce a uniform magnetic flux density B within 

a given interval on the axis of a main solenoid (Fig. 7). The 

problem is described by two degrees of freedom (the 

separation s and the length l of the correcting coils) both 

constrained by box bounds

Fig. 7. Axial cross-section of Loney’s solenoid (upper half-plane). 

Three different basins of attraction can be recognized in the domain of the objective function F, with values of F > 4·10^-8 (high level region: HL), 3·10^-8 < F < 4·10^-8 (low level region: LL), and F < 3·10^-8 (very low level region, i.e. the global minimum region: VL). The very low level region is a small ellipsoidally shaped area within the thin low level valley. In both the VL and LL areas, small changes in one of the parameters result in changes in objective function values of several orders of magnitude, as shown in Fig. 8.
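The three regions can be expressed as a small classification helper, sketched below in Python. The function name `classify_region` is hypothetical, and the assignment of values lying exactly on a threshold (to LL at 4·10^-8 and to VL at 3·10^-8) is an assumption, since the text only gives strict inequalities.

```python
def classify_region(F):
    """Map an objective value F to the HL / LL / VL basin labels."""
    if F > 4e-8:
        return "HL"  # high level region
    if F > 3e-8:
        return "LL"  # low level region
    return "VL"      # very low level (global minimum) region


if __name__ == "__main__":
    for value in (1510.631e-8, 3.2671e-8, 2.068e-8):
        print(value, classify_region(value))
```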

In the numerical tests, a stopping criterion of 3000 objective function evaluations in each run was used. Tables I and II show the results over 30 runs. Table I also shows a comparison with other metaheuristics [10], [11].

It can be noticed that the proposed improvement allows GBSA to become almost as good as some other well-known stochastic optimizers, especially as far as the best solution is concerned, while improvements in the standard deviation and mean value are still required.

Fig. 8. Objective function landscape and detail of the VL area 

TABLE I
SIMULATION RESULTS OF F IN 30 RUNS

Optimization    F(s, l)·10^-8
Method          Maximum (Worst)   Mean      Minimum (Best)   Standard Deviation
GBSA            1510.646          95.960    2.121            282.727
AGBSA           1510.631          88.313    2.068            283.834
SOMA [10]       3.8761            3.2671    2.0595           0.5078
Tribes [11]     3.9526            3.4870    2.0574           0.5079

TABLE II
BEST SOLUTIONS FOR LONEY’S SOLENOID IN 30 RUNS

Optimization    separation   length     F(s, l)·10^-8
Method          s (cm)       l (cm)
GBSA            11.8212      1.6519     2.121
AGBSA           1.6072       1.5145     2.068

V. CONCLUSION 

This paper proposes an improved AGBSA based on historic knowledge. The algorithm shows promising results on Loney’s solenoid design problem, a benchmark featuring many of the characteristics of typical electromagnetic design problems. Since the main remaining weakness of the algorithm, compared with other metaheuristics, is a rather large standard deviation of the optimal solutions, future research will be targeted at introducing additional improvements in order to decrease the spread of solutions.

ACKNOWLEDGMENTS 

This work was supported by the National Council of 

Scientific and Technologic Development of Brazil — 

CNPq — under Grant 476235/2011-1/PQ. 

REFERENCES 

[1] N. Al-Aaawar, T. M. Hijazi, and A. A. Arkadan, “Particle swarm 

optimization of coupled electromechanical systems,” IEEE 

Transactions on Magnetics, vol. 47, no. 5, 2011, pp. 1314-1317. 

[2] G. Crevecoeur, P. Sergeant, L. Dupré, and R. Van de Walle, “A 

two-level genetic algorithm for electromagnetic optimization,” 

IEEE Transactions on Magnetics, vol. 46, no. 7, 2010, pp. 2585- 

2595. 

[3] K. Watanabe, F. Campelo, Y. Iijima, K. Kawano, T. Matsuo, T. 

Mifune, and H. Igarashi, “Optimization of inductors using 

evolutionary algorithms and its experimental validation,” IEEE 

Transactions on Magnetics, vol. 46, no. 8, 2010, pp. 3393-3396. 

[4] P. Alotto, “A hybrid multiobjective differential evolution method for electromagnetic device optimization,” COMPEL, vol. 30, no. 6, 2011, pp. 1815-1828.

[5] P. Di Barba and A. Savini, “Global optimization of Loney’s 

solenoid by means of a deterministic approach,” Int. J. of Applied 

Electromagnetics and Mechanics, vol. 6, no. 4, pp. 247-254, 1995. 

[6] H. Shah-Hosseini, “Principal components analysis by the galaxy-based search algorithm: a novel metaheuristic for continuous optimization,” Int. J. of Comp. Sci. and Eng., vol. 6, no. 1-2, pp. 132-140, 2011.

[7] H. Shah-Hosseini, “Otsu’s criterion-based multilevel thresholding 

by a nature-inspired metaheuristic called galaxy-based search 

algorithm,” Proc. of 3rd World Congr. on Nature and Biologically 

Inspired Computing, Salamanca, Spain, pp. 383-388, 2011. 

[8] G. Ciuprina, D. Ioan and I. Munteanu, “Use of intelligent-particle 

swarm optimization in electromagnetics,” IEEE Transactions on 

Magnetics, vol. 38, no. 2, pp. 1037-1040, 2002. 

[9] R. L. Becerra and C. A. C. Coello, “Cultural differential evolution 

for constrained optimization,” Comp. Methods in Appl. Mechanics 

and Engineering, vol. 195, no. 33-36, pp. 4303-4322, 2006. 

[10] L. S. Coelho and P. Alotto, “Electromagnetic optimization using a 

cultural self-organizing migrating algorithm approach based on 

normative knowledge,” IEEE Trans. on Magnetics, vol. 45, no. 3, 

pp. 1446-1449, 2009. 

[11] L. S. Coelho and P. Alotto, “Tribes optimization algorithm applied 

to the Loney’s solenoid,” IEEE Trans. on Magnetics, vol. 45, no. 

5, pp. 1526- 1529, 2009.

Implementation of a 3D magnetic circuit model for automotive applications

Ioannis Anastasiadis 1,3, Andreas Buchinger 1, Tobias Werth 2, Lukas Bellwald 1 and Kurt Preis 3

1 KAI Kompetenzzentrum Automobil- und Industrieelektronik GmbH, Europastrasse 8, Villach, 9524 Austria

2 Infineon Technologies Austria AG, Siemensstrasse 2, 9500 Villach, Austria

3 Institute for Fundamentals and Theory in Electrical Engineering, Kopernikusgasse 24/3, A-8010 Graz, Austria

Abstract—Finite element modeling of a magnetic circuit used in automotive technologies is presented. A 3D magnetic analysis was performed in order to calculate the field distribution on the surface of giant magnetoresistance (GMR) sensors. Model results were compared with experiments and showed good agreement. The validated model was further used to optimize the magnetic circuit design and to improve the working performance of the sensors.

Index Terms—Sensors, finite element method, magnetic circuits, magnetic fields, giant magnetoresistance.

I. INTRODUCTION

Magnetic sensors play an important role in automotive applications. They are reliable and cost-effective, offer high performance and provide contactless measurements. They are mainly employed for applications such as measuring pedal position, engine transmission control, rotational speed of the wheels, and anti-lock braking systems (ABS) [1].

A new type of magnetic sensor based on the Giant Magneto-resistance (GMR) phenomenon was developed by Infineon [2]. These sensors offer key benefits such as high sensitivity, linear operation over the sensing range, good temperature stability over a wide range and low field detection capabilities. Therefore, they can measure position more precisely or operate at larger distances from the gear wheel. Another benefit of using GMR elements is the low resistance noise. Presently, GMR sensors can be used in fields as small as 10 nT at 1 Hz and up to 10^8 nT. They can operate at temperatures between -55°C and 150°C. Unfortunately, due to their high sensitivity and low field detection capability, GMR elements can easily be driven into saturation if the detected magnetic field reaches a critical strength. Therefore it is very important to ensure that GMR sensors always stay in their linear range.

This can be achieved using an experimental procedure to measure the field distribution of the magnetic circuit. Unfortunately, the experimental method is time-consuming and expensive. In order to overcome these problems, this paper presents the development of a model of GMR sensors based on the finite element method, which can predict the field strength on the surface of the GMR elements with high accuracy.

II. GMR SENSOR CONCEPT 

Magnetoresistance is the change in resistance of a ferromagnetic material caused by an external magnetic field. The measure of magnetoresistance is usually given by the ratio ΔR/R, where R is the resistance for zero magnetic field and ΔR is the change in resistance when the magnetic field changes by an amount ΔH. Usually the value of ΔR/R is small and hence the change in DC voltage remains low. In applications, the DC offset can be minimized by placing the magnetoresistance elements in a Wheatstone bridge configuration.

For some ferromagnetic multilayer stacks, (Fe/Cr)n, it was reported that at low temperatures the resistance can change by up to 50 % [3]. Due to this large change in resistance, the phenomenon was named Giant Magneto-Resistance (GMR). GMR elements consist of a sequence of ferromagnetic and antiferromagnetic layers, which drastically change their resistance under an external magnetic field. The simplest GMR structure is the spin valve, consisting of three layers, of which two ferromagnetic layers are separated by an antiferromagnetic layer [4]. One layer has a fixed magnetization direction and is called the pinned (hard) layer; the other layer, whose magnetization direction is free to rotate with external fields, is termed the free (soft) layer. For industrial use, the pinned layer can be created in two ways: the first uses a current flow to heat the layer, and the second uses laser pulses to heat the selected layer. During cooling, the pinned magnetization is formed. In sensor technology, the pinned layer has its magnetization direction perpendicular to the easy axis of the free layer. This setup gives a linear response of the change in GMR resistance when an external magnetic field is applied.

GMR is a quantum-mechanical phenomenon caused by the spin orientation of the conduction electrons as they pass through the GMR stack. If the spin orientation of the electrons is parallel to the magnetic orientation of the layer, they move freely and the resistance remains low. If the spin orientation is antiparallel to the orientation of the layer, the resistance increases due to collisions with the atoms of the layers. For application purposes, the trigger (external) field should have a magnitude larger than the saturation field of the free layer and smaller than the standoff field of the pinned layer. If this is not the case, the magnetization direction of the layers will be affected, which will change the overall characteristics of the magnetic sensor. The change in the resistance R of the GMR element is related to the angle θ between the magnetization directions of the free and pinned layers. In the simplest form of the GMR elements, the change in resistance varies linearly with the cosine of the angle θ between the layer magnetizations [5]:

$$\frac{R - R_{\parallel}}{R_{\parallel}} = \frac{\Delta R}{2 R_{\parallel}}\,(1 - \mathbf{m}_1 \cdot \mathbf{m}_2) = \frac{\Delta R}{2 R_{\parallel}}\,[1 - \cos\theta] \qquad (1)$$

where R is the resistance of the stack, R|| is the resistance of the stack in the parallel state, ΔR is the difference between the resistance of the stack in the parallel and antiparallel states, and m1 and m2 are the unit magnetization vectors.
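Solving equation (1) for R gives R = R|| + (ΔR/2)(1 − cos θ). The short Python sketch below evaluates this relation; the function name `gmr_resistance` and the numerical values are illustrative assumptions, not data from the paper.

```python
import math


def gmr_resistance(theta, r_parallel, delta_r):
    """Stack resistance from equation (1) for an angle theta (radians)
    between the free-layer and pinned-layer magnetizations."""
    return r_parallel + 0.5 * delta_r * (1.0 - math.cos(theta))


if __name__ == "__main__":
    r_par, d_r = 1000.0, 100.0  # hypothetical values in ohms
    for deg in (0, 90, 180):
        print(deg, gmr_resistance(math.radians(deg), r_par, d_r))
```

At θ = 0 the stack is in the parallel state (R = R||), at θ = 180° it is in the antiparallel state (R = R|| + ΔR), and around θ = 90° the response is approximately linear in the angle, which is the operating point set by the perpendicular pinned layer.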

The magnetic sensor consists of 4 GMR elements situated at the two edges of the sensor chip. The typical size of the GMR stacks is approximately 1 μm in length and 1 mm in depth, with a thickness of a few nanometers. They are connected to each other in a Wheatstone bridge configuration to measure the speed signal. In addition, an extra GMR element is placed at the center of the IC to determine the direction of movement. The GMR element configuration is shown in Figure 1.

Fig. 1: GMR element configuration 

By using the above bridge configuration, it is possible to 

compensate the DC-offset signal coming from the magnetic 

sources. The output signal is given by the following equation: 

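$$V_{\mathrm{sign}} = V_{\mathrm{left}} - V_{\mathrm{right}} = V_{DD}\,\frac{R_4}{R_3 + R_4} - V_{DD}\,\frac{R_2}{R_2 + R_1} \approx B_{x,\mathrm{left}} - B_{x,\mathrm{right}} \qquad (2)$$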

Calculating the magnetic field will provide an indication of 

the sensor output signal. The above equation is valid, since 

the change in resistance of the GMR elements has a linear 

response with the change in magnetic field. This assumption 

is correct for fields around zero, but for larger applied fields 

the overall characteristics of GMR elements will change as 

they will saturate. 
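The bridge output of equation (2) can be illustrated with a minimal Python sketch. The function name `bridge_signal` and the resistor values are assumptions made for the example.

```python
def bridge_signal(v_dd, r1, r2, r3, r4):
    """Differential bridge output V_left - V_right from equation (2)."""
    v_left = v_dd * r4 / (r3 + r4)
    v_right = v_dd * r2 / (r2 + r1)
    return v_left - v_right


if __name__ == "__main__":
    # Balanced bridge: a field common to both sides produces no output.
    print(bridge_signal(5.0, 1000.0, 1000.0, 1000.0, 1000.0))
    # R4 increased on the left side only: a nonzero differential signal.
    print(bridge_signal(5.0, 1000.0, 1000.0, 1000.0, 1100.0))
```

With equal resistances the two half-bridge voltages cancel, which is how the configuration suppresses the DC offset; only a difference between the left and right sides appears at the output.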



III. MAGNETIC CIRCUITS 

Typical magnetic circuits used in automotive technologies consist of a gear wheel, a sensor and a magnet. This magnet, termed the back-bias magnet, is the source of the circuit. The sensor and the back-bias magnet are fixed, while the wheel rotates. The ferromagnetic gear wheel acts as a passive target that concentrates the magnetic field, and the flux lines bend according to the position of the gear wheel, depending on whether or not the static part of the circuit faces a tooth. This difference in the field distribution is sensed by the GMR elements and transformed into an electrical signal at the output of the magnetic sensor. A schematic of this typical circuit is shown in Figure 2.

Fig. 2: Basic magnetic circuit application (labelled parts: magnetic sensor, back-bias magnet)

Previously, the geometries of back-bias magnets and gear wheels for GMR magnetic sensor applications were investigated and optimized [6, 7]. This paper presents the investigation and model development of the circuit shown in Figure 3. The gear wheel has 44 teeth with a circular pitch of 8.18° and is 10 mm long in the y direction. The back-bias magnet structure consists of two magnets joined together, with magnetization directions in the xz plane tilted at an angle of 20° with respect to the z axis, as shown in Figure 3. The dimensions of the magnet are 10 x 10 x 4 mm. The magnet is a ferrite with a remanence of 287 mT.
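The quoted circular pitch follows directly from the number of teeth, assuming the teeth are evenly spaced around the wheel:

```latex
\[
  \text{pitch} = \frac{360^\circ}{44} \approx 8.18^\circ
\]
```

This is the angular range (0° to 8.18°) over which the field distributions in the following figures are plotted.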

Fig. 3: Magnetic circuit under investigation (magnetization directions tilted by 20°; airgap and x, y, z axes indicated)

The magnetic sensor (MS) is placed between the gear wheel and the magnet. The distance from the end of the sensor to the top of a tooth is the airgap distance of the circuit. When the gear wheel rotates, the magnetic field distribution on the GMR element surface changes. As the wheel rotates past the MS by one pitch, the field distribution along the x-axis has a sinusoidal form. It is of interest to calculate this field distribution and to compare it with experimental results.

IV. MODEL CREATION 

The field distribution in the xy plane along the surface of the GMR elements was investigated. Additionally, it is necessary to check the field strength in the normal direction (z-direction) in order to determine the response of the sensor circuit to a change in the airgap distance. For this reason, the field distribution of the magnetic circuit has to be investigated in three dimensions (3D). Because of the complexity of the problem, it is not possible to derive an analytical 3D solution. Therefore, the finite element method was used for this purpose [8]. In this method the geometry of the problem is discretized into smaller regions, in which the field distribution is approximated by means of polynomial shape functions. The approach to the problem is bottom-up: the model was first created in two dimensions and then extruded in the third direction. A 3D scalar magnetic element was used for this model. Figure 4 shows the field distribution for an airgap of 2 mm.

Fig. 4: Field distribution for an airgap of 2 mm

For the model, the back-bias magnet and the gear wheel were created surrounded by air. Since the GMR elements do not disturb the field distribution but are only used to measure the magnetic field, they are ignored in the model. Care has to be taken with the free space surrounding the magnet. Its dimensions should be around 5 times the respective dimensions of the magnet for convergence reasons. A larger model increases the computational time, while a smaller one may distort the calculated field due to calculation errors. All simulations reported in the paper were carried out using a commercial FEM tool [9].

A. MS attached to back-bias magnet 

Initially, the case in which the magnetic sensor is attached to the back-bias magnet was investigated. The airgap was 2 mm. The field distribution on the surface of the GMR elements was calculated for a rotation of 1 pitch. Figure 5 shows the Bx field at the center of the surface of the left and center GMR elements over this rotation.

Fig. 5a: Bx field distribution on the left GMR (Bx in mT versus rotation angle from 0° to 8.18°)

Fig. 5b: Bx field distribution on the center GMR (Bx in mT versus rotation angle from 0° to 8.18°)

As can be seen from Figure 5a, the Bx field on the left GMR element is approximately -10 mT, which is large enough to drive the GMR elements into saturation. Therefore, the MS should not be attached to the back-bias magnet; instead, a distance should be kept between the MS and the magnet to prevent the GMR elements from saturating.

B. MS placed at a distance from the magnet

In this case, a distance of 2 mm is kept between the magnet and the magnetic sensor, as shown in Figure 6.

Fig. 6: The new circuit geometry under investigation (magnetization directions tilted by 20°; x, y, z axes indicated)

With the airgap distance also set to 2 mm and a sensor package that is 1 mm long in the normal direction, the total distance between the magnet back face and the gear wheel teeth was 5 mm.

Since the magnetization of the free layer of the GMR stack follows the direction of the external in-plane field on the surface of the element, we calculated the distribution of the in-plane field components Bx and By along the surface of the GMR stripes. The GMR stripes have a length of approximately 1 mm in the y-direction. To investigate the in-plane field over the length of a GMR stripe, the Bx and By distributions were calculated at equidistant points along the y-direction. The bottom point on the surface of the GMR stripe is denoted as the 0 point, and the subsequent points are spaced by 0.05 mm up to the last point at the top of the stripe. The Bx and By field distributions at those points are shown in Figures 7 and 8 for two GMR elements, one located on the left half-bridge and one at the center of the sensor (GMR5).

Fig. 7a: Bx field distribution on a GMR on the left half-bridge (Bx in mT versus rotation angle from 0° to 8.18°; one curve per point along the stripe, from 0 mm to 1 mm in steps of 0.05 mm)

Fig. 7b: By field distribution on a GMR on the left half-bridge (By in mT versus rotation angle from 0° to 8.18°; one curve per point along the stripe, from 0 mm to 1 mm in steps of 0.05 mm)

Fig. 8a: Bx field distribution on a center GMR element (Bx in mT versus rotation angle from 0° to 8.18°; one curve per point along the stripe, from 0 mm to 1 mm in steps of 0.05 mm)

Fig. 8b: By field distribution on a center GMR element (By in mT versus rotation angle from 0° to 8.18°; one curve per point along the stripe, from 0 mm to 1 mm in steps of 0.05 mm)

The Bx fields along the GMR stripes always show the same response for a rotation of 1 pitch. The By field, on the other hand, shifts as we move from the bottom of the stripe towards the top. The By fields are homogeneously distributed along the stripes and have the same values on the surfaces of all GMR stripes. For symmetry reasons, a GMR element situated on the right half-bridge should show similar Bx and By distributions along its surface.

C. Experimental results 

Experiments were performed for the configuration shown in Figure 6. For comparison with the simulations, the gear wheel speed was set to 1.5 rpm. Such a low speed was chosen because the finite element model was built separately for each position over the rotation angle. As the gear wheel rotates, the magnetic sensor provides the speed signal and the directional signal. These measured signals are compared with the simulation results. A drawback of the experimental procedure is that the Bx and By distributions along the GMR stripe cannot be derived directly; only the in-plane field on the surface of the GMR elements is available while the gear wheel rotates past the magnetic sensor. This experimental field distribution along the GMR surface is compared with the field distributions derived from the model.

Since the Bx and By distributions vary along the GMR stripes, we have to calculate the average field sensed by the elements and compare that field with the experimental results. Subtracting the signals of the left and right halves of the Wheatstone bridge gives the speed signal, while the center GMR element provides the directional signal. The results are shown in Figure 9. The signal is calculated from the field distribution along the surface of the stripes.
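The averaging step can be sketched as follows, assuming a plain arithmetic mean over the equidistant sample points along the stripe (0 mm to 1 mm in 0.05 mm steps, i.e. 21 points). The paper does not state the averaging rule, so this choice, the function names, and the sample field values are illustrative assumptions.

```python
def stripe_sample_points(length_mm=1.0, step_mm=0.05):
    """Equidistant y-positions along a GMR stripe, both ends included."""
    n_steps = round(length_mm / step_mm)
    return [i * step_mm for i in range(n_steps + 1)]


def average_field(samples_mT):
    """Arithmetic mean of the field values sampled along the stripe."""
    return sum(samples_mT) / len(samples_mT)


if __name__ == "__main__":
    points = stripe_sample_points()
    # Synthetic By values (mT) that vary linearly along the stripe.
    by_samples = [-0.5 + 1.0 * y for y in points]
    print(len(points), average_field(by_samples))
```

Repeating this average for every rotation angle gives one field value per angle and element, which can then be inserted into the bridge relation to obtain a simulated signal for comparison with the measurement.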

Fig. 9a: Comparison of directional signal

Fig. 9b: Comparison of speed signal

Again, results are shown for a rotation of 1 pitch.

The comparison between experimental and simulation results for the speed signal reveals a small deviation of approximately 3%. The comparison for the directional signal, which comes from the measurements of the middle GMR element, shows a larger deviation with a mean value of 9%. Differences between simulation and experimental results are due to inaccuracies in the simulation model, such as the mesh density. Another important issue is that in reality the geometry of the gear wheel deviates from the theoretical geometry because of manufacturing tolerances; for example, not every pitch has exactly the same dimension of 6 mm, or there may be small deviations in the height of the teeth of the gear wheel. Such deviations can introduce an error in the calculated signal.

V. CONCLUSION 

A 3D model of a magnetic circuit with a GMR sensor and a rotating gear wheel was developed and verified with experiments. The model is based on finite element analysis and is used to calculate the variations of the field distribution when the gear wheel rotates relative to the stationary part of the magnetic circuit, i.e. the MS and the back-bias magnet. The field distribution was calculated along the GMR element surface. In parallel, experiments were performed for the same configuration to support model development.

The comparison shows a small deviation between the compared values, which indicates the validity of the model. Such a model can give a fast and accurate estimate of the functionality of the magnetic circuit, including the maximum airgap performance of the circuit.


This work was jointly funded by the Federal Ministry of 

Economics and Labour of the Republic of Austria (contract 

98.362/0112-C1/10/2005) and the Carinthian Economic 

Promotion Fund (KWF) (contract 98.362/0112-C1/10/2005). 

REFERENCES 

[1] C. P. O. Treutler, “Magnetic sensors for automotive applications,” Sensors and Actuators A, vol. 91, 2001, pp. 2-6.

[2] Dirk Hammerschmidt, Ernst Katzmaier, et al., “Giant magneto resistors - sensor technology & automotive applications,” SAE 2005 World Congress & Exhibition, SAE, Detroit, 01-01-2005, pp. 1-16.

[3] M. N. Baibich, J. M. Broto, A. Fert, F. Nguyen Van Dau, F. Petroff, P. Etienne, G. Creuzet, A. Friederich and J. Chazelas, “Giant Magnetoresistance of (001) Fe/(001) Cr Magnetic Superlattices,” Phys. Rev. Lett., vol. 61, no. 21, Nov. 1988, pp. 2472-2475.

[4] Robert L. White, “Giant Magnetoresistance: A Primer,” IEEE Trans. on Magn., vol. 28, Sept. 1992, pp. 2482-2487.

[5] S. E. Russek, R. D. McMichael, M. J. Donahue and S. Kaka, “High Speed Switching and Rotational Dynamics in Small Magnetic Thin Film Devices,” Springer, Spin Dynamics in Confined Magnetic Structures 2, vol. 87, 2003, pp. 93-156.

[6] I. Anastasiadis, T. Werth, K. Preis, “Evaluation and optimization of back-bias magnets for automotive applications using Finite Element Methods,” IEEE Transactions on Magnetics, vol. 45, no. 3, March 2009, pp. 1332-1335.

[7] I. Anastasiadis, T. Werth, K. Preis, “Investigation and optimization of magnetic sensor gear wheels for automotive applications,” 14th IGTE Symposium on Numerical Field Calculation in Electrical Engineering, 19-22 Sept. 2010, submitted.

[8] A. Bondeson, T. Rylander, P. Ingelström, Computational Electromagnetics, Springer, 2005.

[9] Tutorial, Electromagnetic Field Analysis Guide, ANSYS Release 11.0, ANSYS Inc., Canonsburg, PA, 2006.








Mixed Order Edge-based Finite Element Method 

by Means of Nonconforming Mesh Connection 

Yoshifumi Okamoto and Shuji Sato 

*Department of Electrical and Electronic Systems Engineering, Utsunomiya University 

7-1-2 Yoto, Utsunomiya, Tochigi 321-8585, Japan 

E-mail: okamotoy@cc.utsunomiya-u.ac.jp 

Abstract—First order elements are widely used for the discretization in the edge-based finite element method. Second order discretization tends to be avoided in practical magnetic field analysis because of the many nonzero entries in the global matrix and the deteriorated convergence of the linear solver. Therefore, the performance of higher order elements is best exploited by restricting the region in which they are applied. This paper presents a mixed order finite element method with reasonable computational cost, realized by nonconforming mesh connection. The higher order discretization is applied to the main region, where higher accuracy is required, and this region is connected by a nonconforming connection to the outer space, which is discretized by first order elements. This paper shows the detailed characteristics of nonconforming mixed order edge-based finite element analysis.

Index Terms—Higher order element, higher order interpolation, mixed order edge-based finite element analysis, nonconforming mesh connection.

I. INTRODUCTION

The edge-based finite element method is widely recognized as a powerful and practical numerical method in the design environment for evaluating the temporal changes of the electromagnetic field and various characteristics of electrical machines. Furthermore, mesh generators have advanced so that finite element meshes can be produced for complicated targets. Against this technical background, 1st order discretization is widely adopted in the edge-based finite element method. On the other hand, 2nd order discretization tends to be avoided in practical magnetic field analysis owing to the many nonzero entries in the global matrix and the deteriorated convergence of the linear solver for the algebraic equation. However, when the element size is reduced, the 2nd order element converges better than the 1st order element to the true value of physical quantities such as the magnetic energy. It is therefore assumed that 2nd order elements should be adopted in the region where accuracy is required and 1st order elements in the other regions. This combinatorial technique might be capable of improving the convergence characteristics of linear solvers and shortening the elapsed time, with lower computational cost than conventional 2nd order discretization. Reference [1] describes mixed order analysis based on hierarchical elements. Furthermore, mixed order analysis in the discontinuous Galerkin method has been applied to the eddy current problem [2]. These references show the effectiveness of mixed order analysis. However, the combination of higher order elements with each other has not been reported, and the nonconforming mesh connection between 2nd and 1st order elements has not been verified.

This paper shows the detailed effectiveness of mixed order edge-based finite element analysis realized using nonconforming mesh connection. Nonconforming mesh connection is mainly classified into two methods, of which one is based on the Lagrange multiplier [3], [4], and the other is based on the linear combination [5] – [8]. We applied the linear combination to the nonconforming connection in order to retain the symmetry of the global matrix. Furthermore, we newly propose finite element analysis using the nonconforming connection between 2nd order elements, and comparisons have been made with conventional conforming analysis and mixed order analysis.

II. MIXED ORDER FINITE ELEMENT METHOD BY MEANS 

OF NONCONFORMING MESH CONNECTION 

A. Weak form for edge-based finite element method 

The weak form $G_i$ of the Maxwell equations for the magnetostatic field is given as follows:

$$G_i = \int_V (\nabla \times \mathbf{N}_i) \cdot (\nu \nabla \times \mathbf{A})\, dV - \int_{V_c} \mathbf{N}_i \cdot \mathbf{J}_0\, dV - \int_{V_m} \nu (\nabla \times \mathbf{N}_i) \cdot \mathbf{B}_r\, dV = 0, \qquad (1)$$

where $\mathbf{N}_i$ is the edge-based shape function, $\nu$ is the reluctivity, $\mathbf{A}$ is the magnetic vector potential, $\mathbf{J}_0$ is the current density vector, and $\mathbf{B}_r$ is the remanence. The domains of the volume integrals $V$, $V_c$, and $V_m$ denote the whole region, the region of the magnetizing winding, and the permanent magnet, respectively. When the magnetic nonlinearity is taken into account, the Newton-Raphson (NR) method supported by a line search based on the functional over the interval (0, 1.0) [9] is adopted as the nonlinear analysis method. The ICCG method with shifted parameter [10] is used as the linear solver for the algebraic equation derived from the edge-based finite element method.

B. Nonconforming mesh connection

This subsection describes the nonconforming mesh connection using the linear combination, which is formulated as follows:

$$A_{ab} = \int_a^b \sum_k \mathbf{N}_k A_k \cdot d\mathbf{l}, \qquad (2)$$

where $A_{ab}$ is the vector potential of the nonconforming edge ab as shown in Figure 1 (a). In the following formulation, suppose that the global coordinate of node k is $(x_k, y_k)$, the global coordinates of nodes a and b are $(x_a, y_a)$ and $(x_b, y_b)$, and the local coordinates of nodes a and b are $(\xi_a, \eta_a)$ and $(\xi_b, \eta_b)$. The element shape on the master side is assumed to be a square or rectangle. Hence, the global coordinate (x, y) on the master side element is as follows:

$$x = \frac{x_1 + x_2}{2} + \frac{x_1 - x_2}{2}\,\xi, \qquad (3)$$

$$y = \frac{y_1 + y_4}{2} + \frac{y_1 - y_4}{2}\,\eta, \qquad (4)$$

where $\xi$ and $\eta$ are the local coordinates on the master side element. $\nabla\xi$ and $\nabla\eta$ are given as follows:

$$\nabla\xi = \frac{\xi_b - \xi_a}{x_b - x_a}\,\mathbf{i}, \qquad (5)$$

$$\nabla\eta = \frac{\eta_b - \eta_a}{y_b - y_a}\,\mathbf{j}, \qquad (6)$$

where $\mathbf{i}$ and $\mathbf{j}$ are the unit vectors in the x- and y-directions. The linear combinations using 1st and 2nd order meshes on the master side are described below.

Coefficients for the 1st order nonconforming mesh 

connection 

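Adopting the 1st order element as the discretization of the master side, (2) becomes the expression:

$$A_{ab} = \sum_{k=1}^{4} I_{ke} A_{ke}, \qquad (7)$$

where $I_{ke}$ is a coefficient which is evaluated by a line integral of the edge-based shape function $\mathbf{N}_{ke}$ along the edge ab as follows:

$$I_{ke} = \int_a^b \mathbf{N}_{ke} \cdot d\mathbf{l} = \int_{-1}^{1} (\mathbf{N}_{ke} \cdot \mathbf{t}_{ab})\, \frac{dl}{d\zeta}\, d\zeta, \qquad (8)$$

where $\mathbf{t}_{ab}$ is the unit vector in the direction ab, $dl/d\zeta$ is the Jacobian, and $\zeta$ is an additional parameter introduced to perform the line integrals analytically. Using $\zeta$, the local coordinates $\xi$ and $\eta$ are transformed into

$$\xi = \frac{\xi_b - \xi_a}{2}\,\zeta + \frac{\xi_a + \xi_b}{2}, \qquad (9)$$

$$\eta = \frac{\eta_b - \eta_a}{2}\,\zeta + \frac{\eta_a + \eta_b}{2}. \qquad (10)$$

Subsequently, $\mathbf{t}_{ab}$ and $dl/d\zeta$ for the analytical evaluation of (8) are as follows:

$$\mathbf{t}_{ab} = \frac{1}{l_{ab}} \{ (x_b - x_a)\,\mathbf{i} + (y_b - y_a)\,\mathbf{j} \}, \qquad (11)$$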

Figure 1. Interpolation to the slave edge ab from master side element. (a) 1st order master side element and (b) Serendipity 2nd order master side element.


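$$\frac{dl}{d\zeta} = \sqrt{\left(\frac{\partial x}{\partial \zeta}\right)^2 + \left(\frac{\partial y}{\partial \zeta}\right)^2} = \frac{l_{ab}}{2}, \qquad (12)$$

where $l_{ab}$ is the length of edge ab. Then, the 1st order edge-based shape functions on the master side surface are as follows:

$$\mathbf{N}_{ke} = \begin{cases} \dfrac{1}{4}(1 + \eta_{ke}\eta)\,\nabla\xi & (1 \le k \le 2) \\[2mm] \dfrac{1}{4}(1 + \xi_{ke}\xi)\,\nabla\eta & (3 \le k \le 4) \end{cases}, \qquad (13)$$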

where $\xi_{ke}$ and $\eta_{ke}$ are the local coordinates on the edge k according to TABLE I. Substituting (11), (12), and (13) into (8), all components of $I_{ke}$ are analytically evaluated as follows:

$$I_{ke} = \begin{cases} \dfrac{1}{8}(\xi_b - \xi_a)\{ 2 + \eta_{ke}(\eta_a + \eta_b) \} & (1 \le k \le 2) \\[2mm] \dfrac{1}{8}(\eta_b - \eta_a)\{ 2 + \xi_{ke}(\xi_a + \xi_b) \} & (3 \le k \le 4) \end{cases}. \qquad (14)$$

TABLE I
LOCAL COORDINATES FOR 1ST ORDER EDGE-BASED SHAPE FUNCTION ON MASTER SIDE

k        1      2      3      4
ξ_ke     0      0      1.0    -1.0
η_ke     1.0    -1.0   0      0
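As a sanity check on the closed form (14), the following minimal Python sketch evaluates the four coefficients for a slave edge given in master-side local coordinates and compares them with a numerical evaluation of the line integral (8). It assumes that the first row of TABLE I holds ξ_ke and the second row holds η_ke, so that edges 1 and 2 run along ξ. The function names and the 3-point Gauss rule are choices made for this illustration and are not taken from the paper.

```python
import math

# TABLE I (assumed row order): local coordinates (xi_ke, eta_ke) of edges k = 1..4
XI_KE = (0.0, 0.0, 1.0, -1.0)
ETA_KE = (1.0, -1.0, 0.0, 0.0)

# 3-point Gauss-Legendre rule on [-1, 1] (exact for polynomials up to degree 5)
GAUSS = ((-math.sqrt(0.6), 5.0 / 9.0), (0.0, 8.0 / 9.0), (math.sqrt(0.6), 5.0 / 9.0))


def coeff_1st(k, a, b):
    """Closed-form I_ke of (14); a and b are (xi, eta) of the slave-edge end points."""
    (xa, ea), (xb, eb) = a, b
    if k <= 2:  # master edges running along xi
        return (xb - xa) * (2.0 + ETA_KE[k - 1] * (ea + eb)) / 8.0
    return (eb - ea) * (2.0 + XI_KE[k - 1] * (xa + xb)) / 8.0


def coeff_1st_quadrature(k, a, b):
    """The same coefficient from the line integral (8), evaluated numerically."""
    (xa, ea), (xb, eb) = a, b
    total = 0.0
    for z, w in GAUSS:
        xi = 0.5 * (xb - xa) * z + 0.5 * (xa + xb)    # (9)
        eta = 0.5 * (eb - ea) * z + 0.5 * (ea + eb)   # (10)
        if k <= 2:
            # N_ke . dl = (1/4)(1 + eta_ke * eta) d(xi), with d(xi) = (xi_b - xi_a)/2 d(zeta)
            total += w * 0.25 * (1.0 + ETA_KE[k - 1] * eta) * 0.5 * (xb - xa)
        else:
            total += w * 0.25 * (1.0 + XI_KE[k - 1] * xi) * 0.5 * (eb - ea)
    return total


# Slave edge covering the left half of the master edge at eta = +1
a, b = (-1.0, 1.0), (0.0, 1.0)
coeffs = [coeff_1st(k, a, b) for k in range(1, 5)]
print(coeffs)
```

For this slave edge only master edge 1 contributes, with weight 0.5, which is what one would expect for a slave edge that is half as long as the master edge it lies on.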

Coefficients for the 2nd order nonconforming mesh 

connection 

Even when 2nd order elements are adopted as the master side mesh, the coefficients of the linear combination are derived in the same way as for the 1st order nonconforming connection. Firstly, the 2nd order edge-based shape functions of Serendipity type [11] on the master side surface shown in Figure 1 (b) are as follows:

$$\mathbf{N}_{ke} = \begin{cases} \dfrac{1}{4}(1 + \eta_{ke}\eta)(4\xi_{ke}\xi + \eta_{ke}\eta)\,\nabla\xi & (1 \le k \le 4) \\[2mm] \dfrac{1}{4}(1 + \xi_{ke}\xi)(4\eta_{ke}\eta + \xi_{ke}\xi)\,\nabla\eta & (5 \le k \le 8) \\[2mm] \dfrac{1}{2}(1 - \eta^2)\,\nabla\xi & (k = 9) \\[2mm] \dfrac{1}{2}(1 - \xi^2)\,\nabla\eta & (k = 10) \end{cases}, \qquad (15)$$

where $\xi_{ke}$ and $\eta_{ke}$ are the local coordinates on the edge k according to TABLE II. Then, adopting the 2nd order element as the master side discretization, (2) becomes the next expression:

$$A_{ab} = \sum_{k=1}^{10} I_{ke} A_{ke}. \qquad (16)$$

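When the shape of the master side element is a square or rectangle, the coordinates x and y can be defined in the same way as for the 1st order element. Therefore, $\nabla\xi$, $\nabla\eta$, $\mathbf{t}_{ab}$, and $dl/d\zeta$ are all the same as (5), (6), (11), and (12). Substituting these equations and (15) into (8), all components of $I_{ke}$ are analytically obtained as follows:

$$I_{ke} = \begin{cases} \dfrac{\xi_b - \xi_a}{48} \big[ \eta_{ke}(\eta_b - \eta_a)\{ 4\xi_{ke}(\xi_b - \xi_a) + \eta_{ke}(\eta_b - \eta_a) \} + 3\{ 2 + \eta_{ke}(\eta_a + \eta_b) \}\{ 4\xi_{ke}(\xi_a + \xi_b) + \eta_{ke}(\eta_a + \eta_b) \} \big] & (1 \le k \le 4) \\[2mm] \dfrac{\eta_b - \eta_a}{48} \big[ \xi_{ke}(\xi_b - \xi_a)\{ 4\eta_{ke}(\eta_b - \eta_a) + \xi_{ke}(\xi_b - \xi_a) \} + 3\{ 2 + \xi_{ke}(\xi_a + \xi_b) \}\{ 4\eta_{ke}(\eta_a + \eta_b) + \xi_{ke}(\xi_a + \xi_b) \} \big] & (5 \le k \le 8) \\[2mm] \dfrac{\xi_b - \xi_a}{24} \{ 12 - 3(\eta_a + \eta_b)^2 - (\eta_b - \eta_a)^2 \} & (k = 9) \\[2mm] \dfrac{\eta_b - \eta_a}{24} \{ 12 - 3(\xi_a + \xi_b)^2 - (\xi_b - \xi_a)^2 \} & (k = 10) \end{cases}. \qquad (17)$$

TABLE II
LOCAL COORDINATES FOR 2ND ORDER EDGE-BASED SHAPE FUNCTION ON MASTER SIDE

k        1      2      3      4      5      6      7      8      9      10
ξ_ke     0.5    -0.5   0.5    -0.5   1.0    1.0    -1.0   -1.0   0      0
η_ke     1.0    1.0    -1.0   -1.0   0.5    -0.5   0.5    -0.5   0      0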
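The 2nd order coefficients can be checked in the same spirit. The sketch below is a minimal Python illustration that assumes the Serendipity shape functions of (15) are read as $(1/4)(1 + \eta_{ke}\eta)(4\xi_{ke}\xi + \eta_{ke}\eta)\nabla\xi$ for k = 1..4, the symmetric form in $\nabla\eta$ for k = 5..8, and $(1/2)(1 - \eta^2)\nabla\xi$, $(1/2)(1 - \xi^2)\nabla\eta$ for k = 9, 10, with TABLE II listing ξ_ke in its first row. Under that reading it compares the closed form (17) with a numerical evaluation of (8). All function names are hypothetical.

```python
import math

# TABLE II (assumed row order): local coordinates of the ten edge functions
XI_KE = (0.5, -0.5, 0.5, -0.5, 1.0, 1.0, -1.0, -1.0, 0.0, 0.0)
ETA_KE = (1.0, 1.0, -1.0, -1.0, 0.5, -0.5, 0.5, -0.5, 0.0, 0.0)
GAUSS = ((-math.sqrt(0.6), 5.0 / 9.0), (0.0, 8.0 / 9.0), (math.sqrt(0.6), 5.0 / 9.0))


def coeff_2nd(k, a, b):
    """Closed-form I_ke of (17) for a slave edge from a to b in local coordinates."""
    (xa, ea), (xb, eb) = a, b
    dx, sx, de, se = xb - xa, xa + xb, eb - ea, ea + eb
    xk, ek = XI_KE[k - 1], ETA_KE[k - 1]
    if k <= 4:
        return dx / 48.0 * (ek * de * (4 * xk * dx + ek * de)
                            + 3 * (2 + ek * se) * (4 * xk * sx + ek * se))
    if k <= 8:
        return de / 48.0 * (xk * dx * (4 * ek * de + xk * dx)
                            + 3 * (2 + xk * sx) * (4 * ek * se + xk * sx))
    if k == 9:
        return dx / 24.0 * (12 - 3 * se ** 2 - de ** 2)
    return de / 24.0 * (12 - 3 * sx ** 2 - dx ** 2)


def coeff_2nd_quadrature(k, a, b):
    """The same coefficient by 3-point Gauss quadrature of (8) with (15)."""
    (xa, ea), (xb, eb) = a, b
    xk, ek = XI_KE[k - 1], ETA_KE[k - 1]
    total = 0.0
    for z, w in GAUSS:
        xi = 0.5 * (xb - xa) * z + 0.5 * (xa + xb)
        eta = 0.5 * (eb - ea) * z + 0.5 * (ea + eb)
        if k <= 4:
            f, d = 0.25 * (1 + ek * eta) * (4 * xk * xi + ek * eta), xb - xa
        elif k <= 8:
            f, d = 0.25 * (1 + xk * xi) * (4 * ek * eta + xk * xi), eb - ea
        elif k == 9:
            f, d = 0.5 * (1 - eta ** 2), xb - xa
        else:
            f, d = 0.5 * (1 - xi ** 2), eb - ea
        total += w * f * 0.5 * d  # d/2 is d(xi)/d(zeta) or d(eta)/d(zeta)
    return total


# Slave edge spanning the whole master side at eta = +1
full_edge = [coeff_2nd(k, (-1.0, 1.0), (1.0, 1.0)) for k in range(1, 11)]
print(full_edge)
```

For a slave edge that coincides with the master side at η = +1, this gives 1 for edges 1 and 2 and 0 for all other edges, so only the two functions attached to that side take part in the connection.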

For the nonconforming connection between 1st order elements, the 1st order connection (14) is adopted. Similarly, between 2nd order elements, the 2nd order connection (17) is adopted. When elements of the same order are connected, the coarser side is taken as the master side of the linear combination and the finer side as the slave side.

On the other hand, the 1st order connection (14) is adopted as the connection order for mixed order analysis, in which 1st and 2nd order elements are connected. The reason is as follows: if the 2nd order connection (17) is applied to the interface of a mixed order connection, the coefficients I9e and I10e with respect to the 1st order edges 1e-4e become zero. Therefore, the diagonal component related to the connection interface may be zero, which makes the algebraic equation difficult to solve.

III. ANALYSIS MODEL 

Figure 2 shows a square coil model used to verify the performance of the various nonconforming connections. The current density is determined from the electric scalar potential so that I = 1000 AT on the current input surface. The meshes for the region including the square coil and for the outer space are connected nonconformally. The nonconforming connection is performed at three surfaces: the 1st surface (x = 120, y: [0, 120], z: [0, 120]), the 2nd surface (x: [0, 120], y = 120, z: [0, 120]), and the 3rd surface (x: [0, 120], y: [0, 120], z = 120). The whole region is set to x: [0, 300], y: [0, 300], and z: [0, 300], and all elements are cubes in order to remove the error caused by element distortion.

Figure 3 shows an open type MRI model [12], in which the main objective is to compute the uniform magnetic flux distribution in the imaging region with high accuracy. The nonconforming connection is performed at three surfaces: the 1st surface (x = 400, y: [0, 400], z: [0, 400]), the 2nd surface (x: [0, 400], y = 400, z: [0, 400]), and the 3rd surface (x: [0, 400], y: [0, 400], z = 400). The magnetic nonlinearity of SS400 is considered in the yoke and pole piece. The remanence of the two facing magnets is set to 1.2 T, and the nonlinear magnetostatic analysis is performed by the NR method.


A. Verification using square coil 

Figure 4 shows two examples of the finite element meshes. The element coefficient matrix is computed by 3×3×3-point Gaussian quadrature. The iteration of the linear solver is stopped when the condition $\|r_k\|_2 / \|b\|_2 < \varepsilon_{cg}$ is satisfied, where $\|r_k\|_2$ and $\|b\|_2$ are the 2-norms of the residual at the k-th iteration and of the right side vector of the algebraic equations, and $\varepsilon_{cg}$ is set to $10^{-6}$. $h_0$ is the typical element size of the standard model, and h is the element size of the target model. Therefore, $(h/h_0)^2$ of the standard model becomes 1.0 as shown in Figure 4 (a). On the other hand, h in the nonconforming cases (hexa-1st + hexa-1st, hexa-2nd + hexa-2nd, and hexa-2nd + hexa-1st) is defined as the element size of the inner mesh including the magnetizing winding.
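The stopping rule on the relative residual can be illustrated with a small solver loop. The following Python sketch is a plain conjugate gradient iteration on a dense toy matrix; it deliberately omits the incomplete Cholesky preconditioner and the shift parameter of the ICCG method used in the paper, and the names `conjugate_gradient` and `eps_cg` are hypothetical. It only shows where the test $\|r_k\|_2 / \|b\|_2 < \varepsilon_{cg}$ enters the iteration.

```python
def conjugate_gradient(A, b, eps_cg=1e-6, max_iter=1000):
    """Solve A x = b for symmetric positive definite A (b must be nonzero).

    Returns the solution and the number of iterations performed before the
    relative residual ||r_k||_2 / ||b||_2 dropped below eps_cg.
    """
    n = len(b)
    x = [0.0] * n
    r = list(b)            # residual for the zero initial guess
    p = list(r)            # first search direction
    rr = sum(v * v for v in r)
    b_norm = rr ** 0.5
    for k in range(max_iter):
        if rr ** 0.5 / b_norm < eps_cg:   # stopping criterion
            return x, k
        Ap = [sum(A[i][j] * p[j] for j in range(n)) for i in range(n)]
        alpha = rr / sum(p[i] * Ap[i] for i in range(n))
        x = [x[i] + alpha * p[i] for i in range(n)]
        r = [r[i] - alpha * Ap[i] for i in range(n)]
        rr_new = sum(v * v for v in r)
        p = [r[i] + (rr_new / rr) * p[i] for i in range(n)]
        rr = rr_new
    return x, max_iter


A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
x, n_iter = conjugate_gradient(A, b)
print(x, n_iter)
```

A looser tolerance such as the $10^{-3}$ used later for the MRI model simply makes the loop return earlier.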

Figure 5 shows the convergence characteristics of the magnetic energy W. All characteristics behave asymptotically as h is reduced. The W values in the nonconforming cases (hexa-1st + hexa-1st and hexa-2nd + hexa-2nd) at $(h/h_0)^2$ = 1.0 are equivalent to those in the conforming cases. These nonconforming characteristics deviate slightly from the conforming characteristics in the range $(h/h_0)^2$ < 1.0. Comparing the nonconforming characteristic (hexa-2nd + hexa-2nd) with (hexa-2nd + hexa-1st), the (hexa-2nd + hexa-2nd) characteristic is superior to that of (hexa-2nd + hexa-1st) in that it approaches the behavior of conforming hexa-2nd more closely.

Figure 2: Square coil model. (labels: coil (I = 1000 AT), current input, current output, direction of current)

Figure 3: Open type MRI model. (labels: imaging region, pole piece (SS400), magnet (Br = 1.2 T), yoke (SS400); unit: [mm])

TABLE III shows the effect of the size ratio on the computational accuracy. The size ratio is the ratio of the element size in the outer space mesh to the element size in the inner region; for example, it is 1.5 in Figure 4 (b). In the nonconforming cases, the mesh for the outer space is subdivided while the mesh for the inner region is fixed. The number of 2nd order elements is set to one eighth of the number of 1st order elements. As the size ratio gets larger, the relative error of W tends to worsen owing to the coarse outer space elements. On the whole, the nonconforming connection using 2nd order elements is more accurate than that using 1st order elements. The elapsed time of (nonconf. 2nd + 1st, size ratio = 2.0) is the shortest among the nonconforming results whose relative error of W is less than 0.1 %. Whereas the accuracy of W using (nonconf. 2nd + 2nd, size ratio = 2.0) is the best among the above-mentioned nonconforming types, its elapsed time is approximately five times that of (nonconf. 2nd + 1st, size ratio = 2.0). Hence, the mesh type (nonconf. 2nd + 1st, size ratio = 2.0) provides sufficient accuracy from the viewpoint of practical analysis.

Figure 4: Finite element meshes of a square coil model. (a) conforming, $(h/h_0)^2$ = 1.0 and (b) nonconforming, $(h/h_0)^2$ = 0.444.
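The relative errors quoted from TABLE III can be reproduced directly from the listed energies. The short Python sketch below uses three rows as read from the table (W in mJ, time for the global matrix and time for ICCG in seconds); the dictionary keys and function name are hypothetical labels chosen for this illustration.

```python
W_STANDARD = 51.0553  # W [mJ] of conf. 2nd (stand.)

# (W [mJ], time for global matrix [s], time for ICCG [s]) for selected rows
ROWS = {
    "conf. 1st": (50.9912, 67.5, 67.1),
    "nonconf. 2nd + 2nd, size ratio 2.0": (51.0505, 8.2, 12.5),
    "nonconf. 2nd + 1st, size ratio 2.0": (51.0117, 2.9, 2.3),
}


def relative_error_percent(w, w_ref=W_STANDARD):
    """Relative error of W against the standard solution, in percent."""
    return abs(w - w_ref) / w_ref * 100.0


for name, (w, t_matrix, t_iccg) in ROWS.items():
    print(f"{name}: error = {relative_error_percent(w):.3f} %, "
          f"matrix + ICCG time = {t_matrix + t_iccg:.1f} s")
```

With these rows, the ICCG time ratio between the two meshes with size ratio 2.0 is 12.5 / 2.3 ≈ 5.4, while the ratio of the summed matrix and ICCG times is 20.7 / 5.2 ≈ 4.0.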

Figure 6 shows the z-component of the magnetic flux density on the z-axis. All characteristics coincide with the standard characteristic, and the relative error between (nonconf. hexa-2nd + hexa-1st) and (conf. hexa-2nd) is less than 0.05 %. The accuracy of the magnetic flux in a local area, as well as that of the magnetic energy, is retained even with the nonconforming connection.

Figure 5: Convergence characteristics of magnetic energy. (W [J] versus $(h/h_0)^2$; curves: stand., conf. hexa-1st, conf. hexa-2nd, nonconf. hexa-1st + hexa-1st, nonconf. hexa-2nd + hexa-2nd, nonconf. hexa-2nd + hexa-1st)

Figure 6: Distributions of Bz on z-axis of square coil model. (Bz [mT] versus z [mm], with the nonconforming boundary between the inner and outer regions and an enlarged view around z = 60 mm; same curves as Figure 5)

TABLE III
ANALYZED RESULTS OF SQUARE COIL MODEL

mesh type           NoE inner  NoE outer  NoE total  size ratio  DoF         nonzero      time for global matrix [s]  ICCG ite.  time for ICCG [s]  W [mJ]   relative error (%) of W vs. conf. 2nd (stand.)
conf. 2nd (stand.)  37,044     1,120,581  1,157,625  1.0         13,759,410  576,830,675  224.2 *                     466        1042.3 *           51.0553  0
conf. 1st           55,296     1,672,704  1,728,000  1.0         5,126,520   86,040,332   67.5                        147        67.1               50.9912  0.125
nonconf. 1st + 1st  55,296     209,088    264,384    2.0         775,368     12,873,180   10.7                        96         6.8                50.9840  0.140
nonconf. 1st + 1st  55,296     26,136     81,432     4.0         236,424     3,901,596    3.5                         59         1.4                50.9548  0.197
nonconf. 1st + 1st  55,296     3,267      58,788     8.0         170,289     2,819,055    2.6                         49         0.9                50.8334  0.435
conf. 2nd           6,912      209,088    216,000    1.0         2,548,920   105,755,060  51.4                        263        142.4              51.0505  0.009
nonconf. 2nd + 2nd  6,912      26,136     33,048     2.0         383,256     15,585,392   8.2                         141        12.5               51.0505  0.009
nonconf. 2nd + 2nd  6,912      3,267      10,179     4.0         115,971     4,574,483    2.7                         98         3.6                51.0179  0.073
nonconf. 2nd + 1st  6,912      1,672,704  1,679,615  0.5         5,037,828   86,450,712   67.7                        214        96.7               51.0477  0.015
nonconf. 2nd + 1st  6,912      209,088    216,000    1.0         693,576     13,378,746   10.2                        114        8.4                51.0408  0.028
nonconf. 2nd + 1st  6,912      26,136     33,048     2.0         154,632     4,398,882    2.9                         84         2.3                51.0117  0.085
nonconf. 2nd + 1st  6,912      3,267      10,179     4.0         88,497      3,312,651    2.0                         84         2.4                50.8903  0.323

CPU: Intel Core i7-2620M 2.7 GHz & 16 GB
CPU *: Intel Core i7-3930K 4.2 GHz (over-clocked) & 32 GB

Figure 7: Finite element meshes of open type MRI model. (a) conforming (hexa-1st), (b) conforming (hexa-2nd isoparametric), (c) nonconforming (hexa-1st + hexa-1st), and (d) nonconforming (hexa-2nd + hexa-1st).

B. Application of mixed order finite element analysis to 

open type MRI model 

This subsection shows the effectiveness of mixed order finite element analysis with nonconforming connection for the open type MRI model shown in Figure 3. The convergence criterion $\varepsilon_{cg}$ of the ICCG method is set to $10^{-3}$, and the NR iteration is stopped when the maximum correction of the magnetic flux density falls to $10^{-3}$ T.

Figure 7 shows the finite element meshes for the open type MRI model. Figure 7 (a), (b), (c), and (d) show the meshes for conforming hexa-1st, conforming hexa-2nd isoparametric, nonconforming (hexa-1st + hexa-1st), and nonconforming (hexa-2nd + hexa-1st), respectively. The nonconforming mesh connection is performed at the interface between the imaging region and the other region in order to reduce the number of elements while keeping the accuracy of the magnetic flux density in the imaging region. The number of elements in conforming hexa-2nd (b) is one eighth of that in conforming hexa-1st (a). The mesh for the imaging region of (c) is exactly the same as that of (a), and the mesh for the outer region of (c) and (d) is completely the same as that of (b).

Figure 8 shows the distributions of the z-direction magnetic flux density Bz on the r-axis in the 45° direction, which is located on the surface (x, y) = (0, 0) shown in Figure 7. There are some noise spikes in the characteristics of (conf. hexa-1st), (nonconf. hexa-1st + hexa-1st), and (nonconf. hexa-2nd + hexa-1st). The noise is likely to be caused by the distortion of 1st order hexahedral elements. The distributions of the 1st order discretization are uneven owing to the interpolation of the flux inside the elements using the edge shape function. The mixed order characteristic of (nonconf. hexa-2nd + hexa-1st) appears to combine two properties: the property of 2nd order is confirmed in the inner region (imaging region), and the 1st order property appears in the outer region.

TABLE IV shows the analysis results for the open type MRI model. Even in the MRI model, the DoF of (conf. 2nd) is half that of (conf. 1st); nevertheless, the elapsed time of (conf. 2nd) is longer than that of (conf. 1st). It is possible that the condition number of the global matrix derived from 2nd order elements is worse than that derived from 1st order elements. On the other hand, the elapsed time of (nonconf. 2nd + 1st) is almost the same as that of (nonconf. 1st + 1st). Furthermore, the number of NR iterations in mixed order analysis is the same as for the other mesh types.

Figure 8: Distributions of Bz on r-axis in open type MRI model. (Bz [T] versus r [m], with the nonconforming boundary between the inner and outer regions; curves: stand., conf. hexa-1st, conf. hexa-2nd, nonconf. hexa-1st + hexa-1st, nonconf. hexa-2nd + hexa-1st)

V. CONCLUSION 

We proposed a mixed order finite element method using a nonconforming mesh connection technique. The obtained results are summarized as follows:

1. We proposed the nonconforming mesh connection between 2nd order elements by means of the linear combination, in which the coefficients of the 2nd order connection are derived from the line integral.

2. The performance of the nonconforming connection for 2nd order elements and mixed order elements was verified using a square coil model. The accuracy of the nonconforming connection including 2nd order discretization is superior to that of meshes discretized by 1st order elements only.

3. The accuracy of mixed order analysis is in good agreement with that of the conforming 2nd order mesh and of the mesh with the 2nd order nonconforming connection.

4. Mixed order analysis is superior to the conventional conforming mesh in terms of elapsed time.

As future work, we will investigate the effectiveness of mixed order edge-based finite element analysis including distorted finite elements.


The authors would like to thank Mr. Y. Tominaga for 

helpful support. This work was supported by Japan 

Society for Promotion of Science (JSPS) Grant-in-Aid 

for Young Scientists (B) (Grant Number: 23760252). 

REFERENCES 

[1] M. Hano, T. Miyamura, and M. Hotta, “Fast and high-accuracy 

finite-element electromagnetic analysis by mixed-order vector 

elements,” The Papers of Joint Technical Meeting on Static 

Apparatus, SA-02-14, RM-02-14, pp. 13-18, Jan. 2002. (in 

Japanese) 

[2] P. Houston, I. Perugia, and D. Schötzau, “Nonconforming mixed 

finite-element approximations to time-harmonic eddy current 

problems,” IEEE Trans. Magn., Vol. 40, No. 2, pp. 1268-1273, Feb. 

2004. 

[3] D. Rodger, H. C. Lai, and P. J. Leonard, “Coupled elements for 

problems involving movement,” IEEE Trans. Magn., Vol. 26, No. 

2, pp. 548-550, Feb. 1990. 


TABLE IV
ANALYSIS RESULTS OF OPEN TYPE MRI MODEL

mesh type           NoE inner region  NoE outer space  NoE total  DoF        nonzero      NR ite.  time for global matrix [s]  total ICCG ite.  time for ICCG [s]
conf. 2nd (stand.)  8,000             556,528          564,528    6,680,172  278,902,794           1035.0                      2,254            2268.6
conf. 1st           8,000             556,528          564,528    1,662,234  27,727,823   7        171.7                       1,714            262.6
nonconf. 1st + 1st  8,000             69,566           77,566     223,699    3,663,713    7        23.3                        617              14.2
conf. 2nd           1,000             69,566                      823,308    33,753,786   7        127.5                       315              526.3
nonconf. 2nd + 1st  1,000             69,566                      212,099    3,709,345    7        23.3                        624              14.9

CPU: Intel Core i7-2620M 2.7 GHz & 16 GB

[4] E. Lange, F. Henrotte, and K. Hameyer, “A variational 

formulation for nonconforming sliding interfaces in finite element 

analysis of electric machines,” IEEE Trans. Magn., Vol. 46, No. 8, 

pp. 2755-2758, Aug. 2010. 

[5] C. Golovanov, J.-L. Coulomb, Y. Marechal, and G. Meunier, “3D 

mesh connection techniques applied to movement simulation,” 

IEEE Trans. Magn., Vol. 28, No. 2, pp. 3359-3362, Feb. 1992. 

[6] H. Kometani, S. Sakabe, and A. Kameari, “3-D analysis of 

induction motor with skewed slots using regular coupling mesh,” 

IEEE Trans. Magn., Vol. 36, No. 4, pp. 1769-1773, Apr. 2000. 

[7] K. Muramatsu, Y. Yokoyama, N. Takahashi, A. Nafalski, and Ö. 

Göl, “Effect of continuity of potential on accuracy in magnetic field 

analysis using nonconforming mesh,” IEEE Trans. Magn., Vol. 36, 

No. 4, pp. 1578-1582, Apr. 2000. 

[8] Y. Okamoto, R. Himeno, K. Ushida, A. Ahagon, and K. Fujiwara, 

“Dielectric heating analysis method with accurate rotational motion 

of stirrer fan using nonconforming mesh connection,” IEEE Trans. 

Magn., Vol. 44, No. 6, pp. 806-809, Jun. 2008. 

[9] Y. Okamoto, K. Fujiwara, and R. Himeno, “Exact minimization of 

energy functional for NR method with line-search technique,” IEEE 

Trans. Magn., Vol. 45, No. 3, pp. 1288-1291, Mar. 2009. 

[10] K. Fujiwara, T. Nakata, and H. Fusayasu, “Acceleration of 

convergence characteristic of the ICCG method,” IEEE Trans. 

Magn., Vol. 29, No. 2, pp. 1958-1961, Mar. 1993. 

[11] A. Kameari, “Calculation of transient 3D eddy current using edge-elements,” IEEE Trans. Magn., Vol. 26, No. 2, pp. 466-469, Mar. 1990.

[12] C. Lee and K. Miyata, “Large-scale magnetic field analysis on 

MRI with hysteresis,” The papers of Joint Technical Meeting on 

Static Apparatus and Rotating Machinery, SA-06-20, RM-06-20, 

pp. 25-30, Jan. 2005. (in Japanese)


Topology Optimization Using Parallel Search 

Strategy for Magnetic Devices 

1 Takumi Nagano, 1 Shogo Yasukawa, 1 Shinji Wakao, and 2 Yoshihumi Okamoto 

1 Waseda University, 3-4-1, Okubo, Shinjuku, Tokyo 169-8555, Japan 

2 Utsunomiya University, 7-1-2, Yoto, Utsunomiya, Tochigi 321-8585, Japan 

E-mail: wakao@waseda.jp 

Abstract— In this paper, we propose a topology optimization method using parallel search strategy for magnetic devices. 

Here, from the viewpoint of convergence speed, we use a gradient method, i.e., the steepest descent method, to minimize an object function. By applying parallel computing, we carry out calculations simultaneously for several patterns of initial variables. From these calculation results, new patterns of initial variables are efficiently created for restarting new search processes, which results in better topologies than the previous ones. Compared with the conventional method, the proposed search method enables us to decrease the total CPU time while keeping the optimization quality.

Index terms— density method, parallel computing, topology optimization. 


Structural optimization is categorized into three types of problems, i.e., size optimization, shape optimization, and topology optimization. Topology optimization is especially useful for weight saving, and it may discover new topologies that are inconceivable with conventional methods. However, the computational load of topology optimization is heavier than that of the other optimizations because of the large search space. In this paper, we apply the “density method with gradient method” to make the optimization process more efficient. In the density method, the topology of the target is expressed with the density values of the elements which constitute the design domain [1]. The gradient method offers good convergence speed as the search method. However, the result of minimization with the gradient method is in most cases a local minimum that depends on the initial variables. Therefore, this paper proposes an efficient topology optimization method based on parallel computing for magnetic devices. In the proposed method, we can efficiently escape local minima by searching simultaneously from various initial variables and by creating new initial variables from the results of the parallel optimization.

II. PROPOSED METHOD 

A. Density method based on sensitivity analysis 

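In the density method, the topology of the target is expressed with the density values of the elements. In this paper, the target is a magnetic circuit. The magnetic permeability $\mu_i$ of the ith element can be formulated as

$$\mu_i = \mu_0 \{ 1 + (\mu_r - 1)\rho_i^2 \}, \qquad (1)$$

where $\rho_i$ is the density value of the ith element, and $\mu_r$ is the relative permeability of the material. In this paper, $\mu_r$ is 1000, and the magnetic system is regarded as linear. In the density method, the number of elements in the design domain should be large to express the details of topologies. When the density method is combined with the gradient method, the first derivative of the object function with respect to the density value (sensitivity) of each element in the design domain needs to be calculated. For efficient sensitivity analysis, the “adjoint variable method” is applied [2]. The algebraic equation of FEM can be formulated as

$$\mathbf{H}\mathbf{A} = \mathbf{G}, \qquad (2)$$

where $\mathbf{H}$ is the whole coefficient matrix, $\mathbf{A}$ is the unknown magnetic vector potential, and $\mathbf{G}$ is the right side vector. Differentiating (2) with respect to the density vector $\boldsymbol{\rho}$ gives (3).

$$\mathbf{H}\frac{\partial \mathbf{A}}{\partial \boldsymbol{\rho}} = \frac{\partial \mathbf{G}}{\partial \boldsymbol{\rho}} - \frac{\partial \mathbf{H}}{\partial \boldsymbol{\rho}}\mathbf{A}. \qquad (3)$$

Multiplying (3) by $\mathbf{H}^{-1}$ gives (4).

$$\frac{\partial \mathbf{A}}{\partial \rho_i} = \mathbf{H}^{-1}\left( \frac{\partial \mathbf{G}}{\partial \rho_i} - \frac{\partial \mathbf{H}}{\partial \rho_i}\tilde{\mathbf{A}} \right), \qquad (4)$$

where $\tilde{\mathbf{A}}$ is the solution of (2).

The sensitivity of the object function W is obtained as

$$\frac{dW}{d\rho_i} = \frac{\partial W}{\partial \rho_i} + \left( \frac{\partial W}{\partial \mathbf{A}} \right)^T \frac{\partial \mathbf{A}}{\partial \rho_i}. \qquad (5)$$

In this paper, W is defined with the value of the magnetic flux density vector. Therefore, the 1st term of (5) vanishes. By substituting (4) into (5), (6) is obtained.

$$\frac{dW}{d\rho_i} = \left( \frac{\partial W}{\partial \mathbf{A}} \right)^T \mathbf{H}^{-1}\left( \frac{\partial \mathbf{G}}{\partial \rho_i} - \frac{\partial \mathbf{H}}{\partial \rho_i}\tilde{\mathbf{A}} \right) = \boldsymbol{\lambda}^T \left( \frac{\partial \mathbf{G}}{\partial \rho_i} - \frac{\partial \mathbf{H}}{\partial \rho_i}\tilde{\mathbf{A}} \right). \qquad (6)$$

As a property of the FEM matrix, $\mathbf{H}$ is symmetric. Taking this into account, (7) is obtained by transforming (6).

$$\mathbf{H}\boldsymbol{\lambda} = \frac{\partial W}{\partial \mathbf{A}}. \qquad (7)$$

$\boldsymbol{\lambda}$ is called the “adjoint variable” and is obtained by solving (7). Finally, the whole sensitivity vector is obtained by substituting $\boldsymbol{\lambda}$ into (6).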
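The adjoint procedure can be illustrated on a tiny algebraic system. The Python sketch below is not an FEM code: it assumes a hypothetical 2×2 symmetric matrix $\mathbf{H}(\boldsymbol{\rho}) = \mathbf{H}_0 + \sum_i \rho_i^2 \mathbf{H}_i$, a constant right side vector, and an object function $W = 1/A_2^2$ chosen to resemble (13). It solves the state equation (2), solves the adjoint equation (7), forms the sensitivity from (6), and compares it with a central finite difference.

```python
def solve2(m, b):
    """Solve a 2x2 linear system by Cramer's rule."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [(b[0] * m[1][1] - m[0][1] * b[1]) / det,
            (m[0][0] * b[1] - m[1][0] * b[0]) / det]


# Hypothetical toy data standing in for the FEM matrices
H0 = [[2.0, -1.0], [-1.0, 2.0]]
H_PARTS = [[[1.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 1.0]]]
G = [1.0, 0.5]  # independent of rho, so dG/drho = 0


def assemble(rho):
    """H(rho) = H0 + sum_i rho_i^2 * H_i, symmetric by construction."""
    return [[H0[r][c] + sum(rho[i] ** 2 * H_PARTS[i][r][c] for i in range(2))
             for c in range(2)] for r in range(2)]


def objective(rho):
    a = solve2(assemble(rho), G)
    return 1.0 / a[1] ** 2


def adjoint_sensitivity(rho):
    h = assemble(rho)
    a = solve2(h, G)                     # state solution of (2)
    dw_da = [0.0, -2.0 / a[1] ** 3]      # dW/dA for W = 1 / A_2^2
    lam = solve2(h, dw_da)               # adjoint equation (7)
    grad = []
    for i in range(2):
        # (dH/drho_i) A with dH/drho_i = 2 rho_i H_i
        dh_a = [2.0 * rho[i] * sum(H_PARTS[i][r][c] * a[c] for c in range(2))
                for r in range(2)]
        grad.append(-sum(lam[r] * dh_a[r] for r in range(2)))  # (6)
    return grad


def finite_difference(rho, step=1e-6):
    grad = []
    for i in range(2):
        up, down = list(rho), list(rho)
        up[i] += step
        down[i] -= step
        grad.append((objective(up) - objective(down)) / (2.0 * step))
    return grad


rho = [0.6, 0.8]
print(adjoint_sensitivity(rho), finite_difference(rho))
```

The point of the adjoint form is that one extra linear solve gives the sensitivities for all design variables, whereas the finite difference check needs two solves per variable.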

B. Basic concept of proposed method 

Here, we carried out parallel computing based on OpenMP for topology optimization using a computer with 8 cores. The basic concept of the proposed method is shown in figure 1.

Figure 1: Basic concept of proposed method.

The proposed method is based on the gradient method, where the solution depends on the initial values. Therefore, we regard the data set of the initial topology, the optimized topology, and the object function value as the properties of one “individual”.

First, we prepare 8 individuals, i.e., as many as the number of cores, by creating random initial topologies and computing their optimized topologies. Next, a new initial topology is created from the property information of 2 selected individuals. The process of creating a new topology from the information of 2 individuals is named “intercross”. We prepare 8 new initial topologies for the next generation and obtain their data sets by optimizing the new initial topologies. A better solution is obtained by repeating the above cycle.

In the following subsections, we explain how to create the initial topologies of the next generation from 2 individuals.

C. How to select 2 individuals for intercross 

Three manners of selecting 2 individuals for intercross are proposed.

a) We combine 2 individuals with superior object function values to pass the properties of superior individuals to the next generation. In this paper, combinations of individuals in the top 4 in terms of object function value are selected for intercross. We create 3 initial topologies of the next generation based on this selection.

b) We combine 2 individuals with different optimized topologies. The purpose of this selection is to bring diversity to the intercross. The difference between 2 topologies is evaluated by the following rule.

As shown in figure 2, the design domain is separated into several areas. Next, we define a reference value in the ith area with the density values as

$$area_i = \frac{1}{N_i}\sum_{j=1}^{N_i} \rho_j^2, \qquad (8)$$

where $N_i$ is the number of elements in the ith area. We define a vector $\mathbf{C}$ as in (9).

$$\mathbf{C} = (area_1, area_2, \ldots, area_8). \qquad (9)$$

Figure 2: Separation of design domain.

The vector $\mathbf{C}$ expresses the character of the optimized topology. The characteristic difference between 2 topologies A and B is evaluated as (10).

$$Diff_{AB} = |\mathbf{C}_A - \mathbf{C}_B|^2. \qquad (10)$$

The combinations of 2 individuals with larger values of (10) are selected for intercross. We create 3 initial topologies of the next generation based on this selection.

c) We combine 2 individuals chosen randomly. The purpose of this selection is also to bring diversity to the intercross. We create 2 initial topologies of the next generation based on this selection.

The 8 initial topologies of the next generation are created based on the selections a)-c).
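Selection b) can be sketched in a few lines of Python. The example below assumes that each area of the design domain is given as a list of element indices (the actual partition of figure 2 is not reproduced here), computes the reference values of (8) and the vector C of (9), and ranks all pairs of individuals by the difference measure (10). Taking the top-ranked pairs is an assumption about how "larger value" is applied; the function names are hypothetical.

```python
from itertools import combinations


def character_vector(rho, areas):
    """Vector C of (9): mean of rho_j^2 over the elements of each area, as in (8)."""
    return [sum(rho[j] ** 2 for j in idx) / len(idx) for idx in areas]


def diff(c_a, c_b):
    """Characteristic difference (10): squared distance between two C vectors."""
    return sum((x - y) ** 2 for x, y in zip(c_a, c_b))


def most_different_pairs(topologies, areas, n_pairs=3):
    """Return the n_pairs index pairs with the largest value of (10)."""
    vectors = [character_vector(rho, areas) for rho in topologies]
    pairs = sorted(combinations(range(len(topologies)), 2),
                   key=lambda p: diff(vectors[p[0]], vectors[p[1]]),
                   reverse=True)
    return pairs[:n_pairs]


# Toy example: 4 elements split into 2 areas, 4 optimized topologies
areas = [[0, 1], [2, 3]]
topologies = [
    [0.0, 0.0, 0.0, 0.0],
    [1.0, 1.0, 1.0, 1.0],
    [1.0, 1.0, 0.0, 0.0],
    [0.5, 0.5, 0.5, 0.5],
]
print(most_different_pairs(topologies, areas, n_pairs=2))
```

In the paper's setting there would be 8 areas and 8 individuals, and the three pairs with the largest difference would each produce one initial topology of the next generation.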

D. How to make new initial topologies 

Here, the 2 individuals selected for intercross are named $I_A$ and $I_B$, and the new individual is named $I_C$. The initial topology of $I_C$ is created in one of the following three manners, which is chosen randomly.

a) The density values of $I_C$'s initial topology are created as the weighted mean of those of the initial topologies of $I_A$ and $I_B$.

b) The density values of $I_C$'s initial topology are created as the weighted mean of those of the optimized topologies of $I_A$ and $I_B$.

To pass the properties of superior individuals to the next generation, the weight coefficients of $I_A$ and $I_B$ are formulated as

$$\rho_{Ci} = \frac{W_A^3}{W_A^3 + W_B^3}\,\rho_{Ai} + \frac{W_B^3}{W_A^3 + W_B^3}\,\rho_{Bi}, \qquad (11)$$

where $W_A$ and $W_B$ are the object function values of $I_A$ and $I_B$, and i stands for the element number.

c) The density values of $I_C$'s initial topology are created as the product of the initial topology of $I_A$ and the optimized topology of $I_B$.

$$\rho_{Ci} = \rho_{Ai}\,\rho_{Bi}. \qquad (12)$$

The distribution of density values $\rho_i$ of 0 in the optimized topology is strongly inherited by the next generation.

The probabilities of occurrence of a), b), and c) are 3/7, 3/7, and 1/7, respectively.
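The three intercross rules and their probabilities can be sketched as follows. This Python example assumes that (11) weights the densities of $I_A$ and $I_B$ by $W_A^3/(W_A^3 + W_B^3)$ and $W_B^3/(W_A^3 + W_B^3)$ respectively, which is one reading of the formula; the dictionary layout of an individual and the function name are hypothetical.

```python
import random


def intercross(ind_a, ind_b, rng=random):
    """Create the initial topology of a new individual from two parents.

    Each individual is a dict with keys "initial", "optimized" (lists of
    density values) and "W" (object function value).
    """
    w3a, w3b = ind_a["W"] ** 3, ind_b["W"] ** 3
    wa, wb = w3a / (w3a + w3b), w3b / (w3a + w3b)   # assumed reading of (11)
    u = rng.random()
    if u < 3.0 / 7.0:    # a) weighted mean of the initial topologies
        return [wa * x + wb * y
                for x, y in zip(ind_a["initial"], ind_b["initial"])]
    if u < 6.0 / 7.0:    # b) weighted mean of the optimized topologies
        return [wa * x + wb * y
                for x, y in zip(ind_a["optimized"], ind_b["optimized"])]
    # c) product (12) of A's initial topology and B's optimized topology
    return [x * y for x, y in zip(ind_a["initial"], ind_b["optimized"])]


parent_a = {"initial": [0.2, 0.4], "optimized": [1.0, 0.0], "W": 64.0}
parent_b = {"initial": [0.6, 0.8], "optimized": [0.0, 1.0], "W": 64.0}
print(intercross(parent_a, parent_b, random.Random(0)))
```

Rule c) sets a density to zero wherever the optimized topology of $I_B$ is zero, which is why holes in an optimized topology are strongly inherited.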

III. NUMERICAL EXAMPLE 

A. C-shaped iron core model 

To demonstrate the validity of the proposed method, we 

carried out the optimization of the model shown in figure 

3. 

Figure 3: Magnetic force model. (unit : mm) 

The main subject of this optimization problem is to 

maximize the electromagnetic force generated in the 

magnetic bar lying in the right side of design domain. The 

density values of elements in the design domain are 

design variables of this optimization [3]. To evaluate the 

electromagnetic force, we adopt the following equation as 

the object function of this optimization, 

1 

2 

By 

W , (13) 

where, By is the y component of magnetic flux density 

vector generated in the target element. This model is 

discretized by triangular elements. The numbers of 

elements, in the design domain and whole domain, are 

1,575 and 5,576 respectively. 

Next, the details of the proposed method are explained. The initial topologies of the 1st generation are created from random values ρ_i in the range from 0 to 1. The initial magnetic permeability distribution is determined according to (1). Let Wbest be the best object function value found up to the n-th generation, and Wn+1 the best object function value in the (n+1)-th generation. If Wn+1 < Wbest, we update Wbest to Wn+1. If no update occurs for 10 generations, the calculation is terminated.
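The termination rule can be sketched as a small loop. This is a hedged illustration only: `next_generation_best` is a hypothetical callback that returns the best object function value of a generation, smaller values are assumed to be better, and `max_generations` is a safety limit that does not appear in the paper.

```python
def run_until_stagnant(next_generation_best, patience=10, max_generations=1000):
    """Track the best (smallest) object function value and stop after
    `patience` consecutive generations without an update."""
    w_best = float("inf")
    stale = 0          # generations since the last update of w_best
    generation = 0
    while stale < patience and generation < max_generations:
        generation += 1
        w_new = next_generation_best(generation)
        if w_new < w_best:   # assumed comparison: smaller W is better
            w_best = w_new
            stale = 0
        else:
            stale += 1
    return w_best, generation
```

With a sequence that improves in generations 1 to 3 and then stays constant, the loop stops after generation 13, that is, ten generations after the last update.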


Figure 4: Flow chart of proposed method. 

For comparison with the proposed method, an optimization without intercross is also applied to this model. In that method, the initial topologies in every generation are created randomly, in the same way as those of the 1st generation. This method is named the “random method”.

Two patterns of initial topologies, named ITA and ITB, are prepared for the 1st generation. An example of an initial topology in the 1st generation is shown in Figure 5.

Figure 5: Density distribution example of initial topology.

Figure 6: Convergence characteristic of each method. (Vertical axis: best object function value Wbest; horizontal axis: generations; curves: random method and proposed method, each for ITA and ITB.)

TABLE I

THE OPTIMIZATION RESULTS OF BOTH METHODS.

Initial pattern   Method     Total generations   Wbest     CPU time (sec.)
ITA               Random     20                  65.0926   764.56
ITA               Proposed   34                  64.0327   347.91
ITB               Random     25                  64.2958   954.66
ITB               Proposed   17                  63.9976   229.98

We carry out the optimizations with these topologies using the proposed method and the random method. The convergence characteristics of both methods are shown in Figure 6. With the random method, the object function value improves slowly and the computation ends in a local minimum. By contrast, the proposed method improves the object function value more efficiently and reaches a better solution than the random method. The object function values Wbest of both methods are listed in Table I.

Figure 7: Comparison of CPU time. (Vertical axis: frequency; horizontal axis: calculation time [CPU sec.]; series: proposed method and random method.)

Figure 8: Optimization result of each method.

(a) The obtained topology with random method from initial pattern ITA.

(b) The obtained topology with proposed method from initial pattern ITA.

(c) The obtained topology with random method from initial pattern ITB.

(d) The obtained topology with proposed method from initial pattern ITB.

The frequency distribution of the calculation time for each individual is shown in Figure 7. It indicates that the calculation time per individual is considerably reduced in the proposed method, because the starting points of the optimization, i.e., the initial topologies, are created near stationary points. The total CPU time can be estimated by multiplying the calculation time for the individuals in one generation by the required number of generations. As a result, the CPU time of the proposed method is much less than that of the random method, regardless of the total number of generations required for convergence, as shown in Table I.

The topologies obtained by both methods are shown in Figure 8. Black elements correspond to magnetic material with ρ_i = 1, and white elements to air with ρ_i = 0. Gray elements have an intermediate property between air and magnetic material with 0 < ρ_i < 1 and are referred to as “gray scale”. The topologies obtained by the random method depend strongly on their initial variables and contain many gray-scale elements. By contrast, the proposed method yields similar topologies without gray scale.

B. Magnetic shielding model 

The proposed method is applied to the optimization of the shield model shown in Figure 9.

Figure 9: Shield model. (unit: mm)

The objective of this optimization problem is to minimize the magnetic flux entering the target domain, which is generated by two surrounding coils. The density values of the elements in the design domain are the design variables of the optimization. The object function is defined as

$$ W = \sum_{\text{target domain}} \left( B_x^2 + B_y^2 \right). \tag{14} $$


This model is discretized with triangular elements. The numbers of elements in the design domain and in the whole domain are 3,800 and 12,106, respectively.

The optimization results of the random and the proposed methods are shown in Figures 10-13.

Figure 10: Convergence characteristic of object function. (Vertical axis: best object function value Wbest; horizontal axis: generations.)

Figure 11: Comparison of CPU time. (Vertical axis: frequency; horizontal axis: calculation time (sec.).)

Annotated values recovered from the figures: Wbest = 4.06 × 10^-4 and Wbest = 6.90 × 10^-5.

Figure 12: Result of optimization. (random method)

Figure 13: Result of optimization. (proposed method)

The computational results demonstrate the effectiveness of the proposed method compared with the random method, as in the case of the magnetic force model. A multilayer structure is obtained as an effective topology of the shield, as shown in Figure 13.

IV. CONCLUSION

In this paper, a topology optimization method using a parallel search strategy for magnetic devices is proposed in order to obtain more global solutions efficiently. In the proposed method, the novel concept of intercross is introduced into the density method with sensitivity analysis, which reduces the CPU time while maintaining the optimization quality.

As future work, we will investigate various intercross algorithms and apply the proposed method to more practical nonlinear problems.

REFERENCES 

[1] H. P. Mlejnek and R. Schirrmacher, "An engineer's approach to optimal material distribution and shape finding," Comput. Methods Appl. Mech. Eng., Vol. 106, pp. 1-26 (1993).

[2] S. Gitosusastro, J. L. Coulomb and J. C. Sabonnadiere, "Performance derivative calculations and optimization process," IEEE Trans. Magn., Vol. 25, No. 4, pp. 2834-2839 (1989).

[3] Yoshihumi Okamoto and Norio Takahashi, "Investigation of Topology Optimization of Magnetic Circuit by Using Density Method," IEEJ Trans. IA, Vol. 124, No. 12, pp. 1228-1235 (2004).

[4] Jin-kyu Byun, Il-han Park, and Song-yop Hahn, "Topology optimization of electrostatic actuator using design sensitivity," IEEE Trans. Magn., Vol. 38, No. 2, pp. 1053-1056 (2002).

[5] Jin-kyu Byun and Song-yop Hahn, "Application of topology optimization to electromagnetic system," International Journal of Applied Electromagnetics and Mechanics, Vol. 13, No. 1-4, pp. 25-33 (2002).









Interaction Magnetic Force Calculation of Axial 

Passive Magnetic Bearing Using Magnetization 

Charges and Discretization Technique 

*Saša S. Ilić, *Ana N.Vučković and *Slavoljub Aleksić 

*University of Niš, Faculty of Electronic Engineering of Niš, Aleksandra Medvedeva 14, 18000 Niš, Serbia 

E-mail: ana.vuckovic@elfak.ni.ac.rs 

Abstract— The paper presents the calculation of the force between two ring permanent magnets whose magnetization is axial. Such a configuration corresponds to a passive magnetic bearing. A simple and fast analytical approach based on magnetization charges and a discretization technique is used for this calculation. The results for the interaction magnetic force obtained using the proposed approach are compared with the finite element method using FEMM 4.2 software.

Index Terms— Permanent magnet, interaction magnetic force, magnetization charges, discretization technique, Finite Element Method (FEM).

I. INTRODUCTION

Permanent magnets are used nowadays in many applications, and the general need for dimensioning and optimization drives the development of calculation methods. Permanent magnets are common in electrical devices, whose quality depends on the magnet material, magnetization and dimensions. Two major kinds of applications can be identified: those which use block magnets and those which use cylindrical magnets. Block permanent magnets are easy to manufacture and to magnetize, and the magnetic field they create is easier to calculate [1-3]. In practice, most engineering applications need several ring permanent magnets, and the determination of the magnetic force between them is thus required.

Magnetic bearings are contactless suspension devices with various rotating and translational applications [4]. Depending on the magnetization direction of the ring permanent magnets, the devices work as axial or radial bearings and thus control the position along an axis or the centering of an axis. Knowledge of the interaction magnetic force is required to control such devices reliably.

There are numerous techniques for analyzing permanent magnet devices and different approaches for determining the interaction forces between them [5-8]. Many authors propose simplified and robust formulations of the interaction forces created by permanent magnets. The authors generally use Ampere's current model [9],[10] or the Coulombian approach [11],[12]. Several application examples were previously presented in [13],[14], where levitation forces for magnetic bearings were calculated.

II. THEORETICAL BACKGROUND

The axial passive magnetic bearing [5] considered in the paper is presented in Figure 1.

Figure 1: Axial passive magnetic bearing.

Since the boundary condition for the surface magnetization charge density has to be satisfied for both magnets,

$$ \eta_{m1} = \hat{n} \cdot \mathbf{M}_1 \quad \text{and} \quad \eta_{m2} = \hat{n} \cdot \mathbf{M}_2, \tag{1} $$

it is obvious that fictitious surface magnetization charges [2] exist only on the bottom and the top bases of each permanent magnet, because volume magnetization charges do not exist for uniform magnetization. Here n̂ is the unit vector normal to the surface.

The simplest procedure for determining the interaction magnetic force is to discretize each base of both permanent magnets into a system of circular loops. The interaction force between two magnetized circular loops is calculated first: the magnetic field and magnetic flux density generated by an arbitrary magnetized circular loop of the upper magnet are determined, and then the force that acts on an arbitrary loop of the lower magnet (Figure 2). The magnetic field of the upper loop is determined from the magnetic scalar potential. Using the result for the interaction magnetic force between two circular loops, the magnetic force of the axial magnetic bearing is obtained by summing the contributions of both bases of the lower and upper permanent magnets, using a uniform discretization technique.

Figure 2: Two circular loops.

The goal of this approach is to determine the interaction magnetic force between two circular loops uniformly loaded with magnetization charges Q_{m1} and Q_{m2}. The dimensions and positions of the loops are presented in Figure 2. To determine the interaction force between the two circular loops, the magnetic scalar potential, magnetic field and magnetic flux density generated by the upper loop are calculated. The elementary magnetic scalar potential generated by the elementary point magnetization charge dQ_{m1} is

$$ d\varphi_m = \frac{1}{4\pi}\,\frac{dQ_{m1}}{R}. \tag{2} $$

Since

$$ dQ_{m1} = Q'_{m1}\,dl = \frac{Q_{m1}}{2\pi a}\,a\,d\theta' = \frac{Q_{m1}}{2\pi}\,d\theta', $$

the elementary magnetic scalar potential has the following form

$$ d\varphi_m = \frac{Q_{m1}}{8\pi^2}\,\frac{1}{R}\,d\theta', \tag{3} $$

and the resulting magnetic scalar potential generated by the upper circular loop at an arbitrary point P(r, θ, z) is

$$ \varphi_m = \frac{Q_{m1}}{8\pi^2} \int_0^{2\pi} \frac{d\theta'}{\sqrt{r^2 + r_0^2 + (z - z_0)^2 - 2 r_0 r \cos(\theta - \theta')}}. \tag{4} $$

Considering the existing symmetry, in the θ = 0 plane the magnetic scalar potential has the following form

$$ \varphi_m(r, z) = \frac{Q_{m1}}{4\pi^2} \int_0^{\pi} \frac{d\theta'}{\sqrt{r^2 + r_0^2 + (z - z_0)^2 - 2 r r_0 \cos\theta'}}. \tag{5} $$

Substituting θ' = π − 2α in Eq. (5), the magnetic scalar potential is obtained as:

$$ \varphi_m(r, z) = \frac{Q_{m1}}{2\pi^2} \int_0^{\pi/2} \frac{d\alpha}{\sqrt{(r + r_0)^2 + (z - z_0)^2 - 4 r r_0 \sin^2\alpha}}. \tag{6} $$

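After some simple operations, the magnetic scalar potential can be given in the form:

$$ \varphi_m(r, z) = \frac{Q_{m1}}{2\pi^2}\,\frac{K(\pi/2, k)}{\sqrt{(r + r_0)^2 + (z - z_0)^2}}, \tag{7} $$

where

$$ K(\pi/2, k) = \int_0^{\pi/2} \frac{d\alpha}{\sqrt{1 - k^2 \sin^2\alpha}} $$

is the complete elliptic integral of the first kind with modulus

$$ k^2 = \frac{4 r r_0}{(r + r_0)^2 + (z - z_0)^2}. $$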

The external magnetic field (the magnetic field generated by the upper loop) at an arbitrary point can be determined as

$$ \mathbf{H}^{ext}(r, z) = -\operatorname{grad}\,\varphi_m(r, z) = H_r^{ext}(r, z)\,\hat{r} + H_z^{ext}(r, z)\,\hat{z}. \tag{8} $$

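The external magnetic flux density is

$$ \mathbf{B}^{ext}(r, z) = \mu_0\,\mathbf{H}^{ext}(r, z), \tag{9} $$

$$ \mathbf{B}^{ext}(r, z) = B_r^{ext}(r, z)\,\hat{r} + B_z^{ext}(r, z)\,\hat{z}, \tag{10} $$

with components

$$ B_r^{ext}(r, z) = \mu_0 H_r^{ext}(r, z) = -\mu_0 \frac{\partial \varphi_m(r, z)}{\partial r}, \tag{11} $$

$$ B_r^{ext}(r, z) = \frac{\mu_0 Q_{m1}}{2\pi^2}\,\frac{1}{2r} \left[ \frac{K(\pi/2, k)}{\sqrt{(r + r_0)^2 + (z - z_0)^2}} + \frac{\left( r^2 - r_0^2 - (z - z_0)^2 \right) E(\pi/2, k)}{\sqrt{(r + r_0)^2 + (z - z_0)^2}\,\left( (r - r_0)^2 + (z - z_0)^2 \right)} \right], \tag{12} $$

and

$$ B_z^{ext}(r, z) = \mu_0 H_z^{ext}(r, z) = -\mu_0 \frac{\partial \varphi_m(r, z)}{\partial z}, \tag{13} $$

$$ B_z^{ext}(r, z) = \frac{\mu_0 Q_{m1}}{2\pi^2}\,\frac{(z - z_0)\,E(\pi/2, k)}{\sqrt{(r + r_0)^2 + (z - z_0)^2}\,\left( (r - r_0)^2 + (z - z_0)^2 \right)}, \tag{14} $$

where

$$ E(\pi/2, k) = \int_0^{\pi/2} \sqrt{1 - k^2 \sin^2\alpha}\;d\alpha $$

is the complete elliptic integral of the second kind with modulus

$$ k^2 = \frac{4 r r_0}{(r + r_0)^2 + (z - z_0)^2}. $$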
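As a numerical illustration of (12) and (14), the sketch below evaluates the flux density of a single charged loop in plain Python. It is a minimal sketch under stated assumptions, not the authors' code. The function names `ellip_ke` and `loop_flux_density` are hypothetical. The elliptic integrals are computed with the arithmetic-geometric mean iteration instead of a library routine. The bracket of (12) is assumed to have the form K/s + (r² − r0² − (z − z0)²) E/(s d²), with s² = (r + r0)² + (z − z0)² and d² = (r − r0)² + (z − z0)².

```python
import math

MU0 = 4e-7 * math.pi  # vacuum permeability in H/m

def ellip_ke(m):
    """Complete elliptic integrals K and E for parameter m = k^2 (0 <= m < 1),
    computed with the arithmetic-geometric mean iteration."""
    a, b = 1.0, math.sqrt(1.0 - m)
    c2_sum = 0.5 * m   # running sum of 2^(n-1) * c_n^2, starting with n = 0
    power = 1.0        # factor 2^(n-1) for n = 1
    while abs(a - b) > 1e-15 * a:
        a, b, c = 0.5 * (a + b), math.sqrt(a * b), 0.5 * (a - b)
        c2_sum += power * c * c
        power *= 2.0
    k_val = math.pi / (2.0 * a)
    return k_val, k_val * (1.0 - c2_sum)

def loop_flux_density(q_m1, r0, z0, r, z):
    """B_r and B_z at (r, z), r > 0, due to a loop of radius r0 at height z0
    carrying the total magnetization charge q_m1 (assumed forms of (12), (14))."""
    dz = z - z0
    s2 = (r + r0) ** 2 + dz ** 2   # (r + r0)^2 + (z - z0)^2
    d2 = (r - r0) ** 2 + dz ** 2   # (r - r0)^2 + (z - z0)^2
    k_val, e_val = ellip_ke(4.0 * r * r0 / s2)
    pref = MU0 * q_m1 / (2.0 * math.pi ** 2 * math.sqrt(s2))
    b_r = pref / (2.0 * r) * (k_val + (r * r - r0 * r0 - dz * dz) * e_val / d2)
    b_z = pref * dz * e_val / d2
    return b_r, b_z
```

Two limiting cases give a quick plausibility check: close to the axis, B_z approaches μ0 Q (z − z0) / (4π (r0² + (z − z0)²)^(3/2)), and far from the loop in its own plane, B_r approaches the point-charge value μ0 Q / (4π r²).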

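The interaction magnetic force on the elementary magnetization charge of the lower circular loop,

$$ dQ_{m2} = Q'_{m2}\,dl = \frac{Q_{m2}}{2\pi b}\,b\,d\theta = \frac{Q_{m2}}{2\pi}\,d\theta, $$

is

$$ d\mathbf{F} = dQ_{m2}\,\mathbf{B}^{ext}(r_m, z_m). \tag{15} $$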

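Finally, the interaction magnetic force components can be expressed as:

$$ F_r(r, z) = \frac{\mu_0 Q_{m1} Q_{m2}}{2\pi^2}\,\frac{1}{2 r_m} \left[ \frac{K(\pi/2, k_0)}{\sqrt{(r_m + r_0)^2 + (z_m - z_0)^2}} + \frac{\left( r_m^2 - r_0^2 - (z_m - z_0)^2 \right) E(\pi/2, k_0)}{\sqrt{(r_m + r_0)^2 + (z_m - z_0)^2}\,\left( (r_m - r_0)^2 + (z_m - z_0)^2 \right)} \right], \tag{16} $$

$$ F_z(r, z) = \frac{\mu_0 Q_{m1} Q_{m2}}{2\pi^2}\,\frac{(z_m - z_0)\,E(\pi/2, k_0)}{\sqrt{(r_m + r_0)^2 + (z_m - z_0)^2}\,\left( (r_m - r_0)^2 + (z_m - z_0)^2 \right)}, \tag{17} $$

$$ F_z(r, z) = \frac{\mu_0 Q_{m1} Q_{m2}}{2\pi^2}\,F_{zp}(r_0, r_m, z_0, z_m), \tag{18} $$

with elliptic integrals modulus

$$ k_0^2 = \frac{4 r_0 r_m}{(r_m + r_0)^2 + (z_m - z_0)^2}. $$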

The axial component of the force (17) represents the interaction force between two magnetized circular loops. The simplest procedure for determining the levitation magnetic force is to discretize each base of the permanent magnets into a system of circular loops, where N1 is the number of discretized segments of each base of the upper permanent magnet and N2 is the number of discretized segments of each base of the lower permanent magnet.

Figure 3: Discretizing model. 

Taking into account the ring geometry of the permanent magnets (Figure 3), the radius of each discretized segment of both bases of the upper magnet is

$$ r_n = a + \frac{2n - 1}{2 N_1}\,(b - a), \quad n = 1, 2, \ldots, N_1, \tag{19} $$

and the magnetization loop charges of the upper permanent magnet bases are

$$ Q_{mn} = M_1\,2\pi r_n\,\frac{b - a}{N_1}, \quad n = 1, 2, \ldots, N_1. \tag{20} $$

For the lower magnet bases, the radius of each discretized segment is

$$ r_i = c + \frac{2i - 1}{2 N_2}\,(d - c), \quad i = 1, 2, \ldots, N_2. \tag{21} $$

The magnetization loop charges of the lower permanent magnet bases are

$$ Q_{mi} = M_2\,2\pi r_i\,\frac{d - c}{N_2}, \quad i = 1, 2, \ldots, N_2. \tag{22} $$
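The uniform discretization of (19)-(22) can be illustrated with a short helper. This is a sketch with a hypothetical function name, `ring_segments`, which serves both magnets: pass (a, b, N1, M1) for the upper magnet or (c, d, N2, M2) for the lower one.

```python
import math

def ring_segments(r_inner, r_outer, n_seg, magnetization):
    """Midpoint radii, as in (19)/(21), and loop charges, as in (20)/(22),
    for one base of a ring magnet split into n_seg equal-width segments."""
    width = (r_outer - r_inner) / n_seg
    radii = [r_inner + (2 * n - 1) / 2 * width for n in range(1, n_seg + 1)]
    charges = [magnetization * 2.0 * math.pi * r * width for r in radii]
    return radii, charges
```

Because the radii are segment midpoints, the loop charges add up to the magnetization times the base area π(r_outer² − r_inner²) for any number of segments, which is a convenient consistency check.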

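Using the result for the interaction magnetic force between two circular loops, Eq. (17), the levitation magnetic force between two ring permanent magnets can be obtained by summing the contributions of both bases of the lower and upper permanent magnets, using the uniform discretization technique:

$$ F_z = \frac{2 \mu_0 M_1 M_2}{N_1 N_2}\,(b - a)(d - c) \sum_{n=1}^{N_1} \sum_{i=1}^{N_2} r_n r_i \left[ F_{zp}(r_n, r_i, h, 0) - F_{zp}(r_n, r_i, h + L_1, 0) - F_{zp}(r_n, r_i, h, L_2) + F_{zp}(r_n, r_i, h + L_1, L_2) \right]; \tag{23} $$

$$ F_z = \frac{2 \mu_0 M_1 M_2}{N_1 N_2}\,(b - a)(d - c) \sum_{n=1}^{N_1} \sum_{i=1}^{N_2} r_n r_i \left[ -\frac{h\,E(\pi/2, k_1)}{\sqrt{(r_i + r_n)^2 + h^2}\,\left( (r_i - r_n)^2 + h^2 \right)} - \frac{(L_2 - h)\,E(\pi/2, k_2)}{\sqrt{(r_i + r_n)^2 + (L_2 - h)^2}\,\left( (r_i - r_n)^2 + (L_2 - h)^2 \right)} + \frac{(L_1 + h)\,E(\pi/2, k_3)}{\sqrt{(r_i + r_n)^2 + (L_1 + h)^2}\,\left( (r_i - r_n)^2 + (L_1 + h)^2 \right)} + \frac{(L_2 - L_1 - h)\,E(\pi/2, k_4)}{\sqrt{(r_i + r_n)^2 + (L_2 - L_1 - h)^2}\,\left( (r_i - r_n)^2 + (L_2 - L_1 - h)^2 \right)} \right], \tag{24} $$

where

$$ k_1^2 = \frac{4 r_n r_i}{(r_i + r_n)^2 + h^2}, \quad k_2^2 = \frac{4 r_n r_i}{(r_i + r_n)^2 + (L_2 - h)^2}, $$

$$ k_3^2 = \frac{4 r_n r_i}{(r_i + r_n)^2 + (L_1 + h)^2}, \quad k_4^2 = \frac{4 r_n r_i}{(r_i + r_n)^2 + (L_2 - L_1 - h)^2}. $$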
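The double sum over loop pairs can be sketched in a few lines of plain Python. This is a hedged illustration, not the authors' program. The function names are hypothetical. The geometry is assumed to place the lower magnet bases at z = 0 and z = L2 and the upper magnet bases at z = h and z = h + L1. The bracket signs follow from the base charge signs under that assumption, and the elliptic integral is evaluated with the arithmetic-geometric mean iteration. All lengths are given in units of L2, so the result is the normalized force F_z / (μ0 M² L2²).

```python
import math

def ellip_e(m):
    """Complete elliptic integral of the second kind, parameter m = k^2 < 1 (AGM)."""
    a, b = 1.0, math.sqrt(1.0 - m)
    c2_sum, power = 0.5 * m, 1.0
    while abs(a - b) > 1e-15 * a:
        a, b, c = 0.5 * (a + b), math.sqrt(a * b), 0.5 * (a - b)
        c2_sum += power * c * c
        power *= 2.0
    return math.pi / (2.0 * a) * (1.0 - c2_sum)

def f_zp(r0, rm, z0, zm):
    """Geometric factor of (17)-(18) for two coaxial loops."""
    dz = zm - z0
    s2 = (rm + r0) ** 2 + dz ** 2
    d2 = (rm - r0) ** 2 + dz ** 2
    return dz * ellip_e(4.0 * r0 * rm / s2) / (math.sqrt(s2) * d2)

def normalized_axial_force(a, b, c, d, l1, h, n1=50, n2=50, m1m2=-1.0):
    """F_z / (mu0 * M^2 * L2^2) with lengths in units of L2 (so L2 = 1).
    m1m2 = M1*M2/M^2, i.e. -1 for opposite magnetization directions."""
    total = 0.0
    for n in range(1, n1 + 1):
        rn = a + (2 * n - 1) / (2 * n1) * (b - a)        # upper magnet loop radius
        for i in range(1, n2 + 1):
            ri = c + (2 * i - 1) / (2 * n2) * (d - c)    # lower magnet loop radius
            total += rn * ri * (f_zp(rn, ri, h, 0.0) - f_zp(rn, ri, h + l1, 0.0)
                                - f_zp(rn, ri, h, 1.0) + f_zp(rn, ri, h + l1, 1.0))
    return 2.0 * m1m2 * (b - a) * (d - c) / (n1 * n2) * total
```

Under these assumptions the force vanishes when the magnets are axially centered, h = (L2 − L1)/2, and is antisymmetric about that position, which agrees with the sign change between h/L2 = 0.2 and 0.3 in Table II. A call such as `normalized_axial_force(1, 2, 3, 4, 0.5, 0.1)` can be compared with the tabulated values.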

III. NUMERICAL RESULTS 

We assume that both ring permanent magnets are made of the same material and are magnetized uniformly along their axis of symmetry, but in opposite directions, M_1 = −M_2 = M.

The distribution of magnetic flux density obtained using the FEMM 4.2 software [15] is presented in Figure 4. The values of the geometrical parameters used in the numerical computation are: a/L2 = 1, b/L2 = 2, c/L2 = 3, d/L2 = 4, L1/L2 = 0.5, h/L2 = 1.5, L2 = 1 mm and M = 900 kA/m.

Convergence of the normalized interaction force,

$$ F_z^{nor} = \frac{F_z}{\mu_0 M^2 L_2^2}, $$

obtained using the presented approach is given in Table I for magnetic bearing dimensions: a/L2 = 1, b/L2 = 2, c/L2 = 3, d/L2 = 4, L1/L2 = 0.5 and h/L2 = 0.1.

Figure 4: Distribution of magnetic flux density for magnetic bearing 

obtained using FEMM 4.2 software. 

TABLE I

CONVERGENCE OF LEVITATION MAGNETIC FORCE VERSUS NUMBER OF SEGMENTS.

N_tot   F_z^nor
10      -0.0732327
20      -0.0741579
30      -0.0743319
50      -0.0744213
100     -0.0744591
200     -0.0744685
300     -0.0744703
500     -0.0744712

F_z^nor (FEM): -0.07491167

In order to save the calculation time, the number of 

segments is limited on N tot N1 

N2 

200 because it 

is not necessary to take a greater number of segments to 

obtain a desired accuracy. 

Results for the normalized interaction magnetic force of two identical ring permanent magnets versus h/L2, obtained using the presented analytical approach and the finite element method (FEM) for parameters a/L2 = 1, b/L2 = 2, c/L2 = 3, d/L2 = 4 and L1/L2 = 0.5, are compared in Table II.

Comparative results for the normalized interaction magnetic force of the axial passive magnetic bearing versus the ratios a/L2 and b/L2, obtained using the presented approach and the finite element method (FEM) for parameters c/L2 = 3, d/L2 = 4, L1/L2 = 0.5 and h/L2 = 1.5, are shown in Table III.


TABLE II

COMPARED RESULTS FOR INTERACTION MAGNETIC FORCE VERSUS h/L2

Columns: h/L2, F_z^nor, F_z^nor (FEM)

0 -0.120071 -0.120515 

0.1 -0.074469 -0.074911 

0.2 -0.025238 -0.025659 

0.3 0.025238 0.024873 

0.4 0.074469 0.074099 

0.5 0.120071 0.119765 

0.6 0.159974 0.159691 

0.7 0.192597 0.192327 

0.8 0.216978 0.216771 

0.9 0.232821 0.232627 

1.0 0.240465 0.240293 

1.1 0.240766 0.240654 

1.2 0.234930 0.234829 

1.3 0.224326 0.224240 

1.4 0.210322 0.210279 

1.5 0.194163 0.194138 

TABLE III

COMPARED RESULTS FOR INTERACTION MAGNETIC FORCE VERSUS a/L2 AND b/L2

Columns: a/L2, b/L2, F_z^nor, F_z^nor (FEM)

1.0 2.0 0.194163 0.194138 

1.5 2.5 0.320992 0.321208 

2.0 3.0 0.209864 0.210493 

2.5 3.5 -0.614301 -0.613222 

3.0 4.0 -1.341280 -1.339868 

3.5 4.5 -0.714116 -0.712579 

4.0 5.0 0.272002 0.273619 

4.5 5.5 0.491236 0.492676 

5.0 6.0 0.352195 0.353756 


IV. CONCLUSION

The determination of the interaction forces of an axial passive magnetic bearing is presented. It is performed using magnetization charges and a discretization technique. It was assumed that both magnets are made of the same material and magnetized uniformly along the magnets' axis of symmetry, with the same intensity but in opposite directions. The derived algorithm is easily implemented in any standard computer environment and enables rapid parametric studies of the interaction force. The results of the presented approach are confirmed using FEMM 4.2 software. Table I shows that a large number of segments (more than 200) is not necessary to obtain the desired accuracy, so computational time can be saved. The interaction force calculation using the presented approach for the mentioned parameters and N_tot = 200 was performed on an Intel Core 2 Duo CPU at 2.4 GHz with 4 GB of RAM and took less than two seconds of run time. The interaction forces were also determined on the same computer using FEMM 4.2 software, where the computation time was 14 minutes for about 1.8 million finite elements. Therefore, the advantages of the presented analytical approach are its accuracy, simplicity and time efficiency.

V. ACKNOWLEDGEMENT 

The work presented here was partly supported by the 

Serbian Ministry of Education and Science in the frame 

of the project TR 33008. 

REFERENCES 

[1] J. S Agashe and D. P Arnold, “A study of scaling and geometry 

effects on the forces between cuboidal and cylindrical magnets 

using analytical force solutions”, J. Phys. D: Appl. Phys. 41 

105001, pp.1-9, 2008. 

[2] A. N. Vučković, S. R. Aleksić, S. S. Ilić.: “Calculation of the 

Attraction and Levitation Forces Using Magnetization Charges”, 

The 10th International Conference on Applied Electromagnetics – 

PES 2011, Proceedings of full papers (CDROM), pp. 33-55, 25- 

29 September, Niš, Serbia, 2011. 

[3] G. Akoun, J. P. Yonnet.: “3d Analytical Calculation of the Forces 

Exerted between two Cuboidal Magnets”, IEEE Transactions on 

Magnetics, Vol. 20, No. 5, pp. 1962-1964, September 1984. 

[4] S. I. Babic, C. Akyel.: “Magnetic Force Calculation between Thin 

Coaxial Circular Coils in Air”, IEEE Transactions on Magnetics, 

Vol. 44, No. 4, pp. 445-452, April 2008. 

[5] V. Lemarquand, G. Lemarquand.: “Passive Permanent Magnet 

Bearings for Rotating Shaft: Analytical Calculation”, Magnetic 

Bearings, Theory and Applications, Sciyo Published book, pp. 85- 

116, October 2010. 

[6] R. Ravaud, G. Lemarquand, V. Lemarquand.: “Force and 

Stiffness of Passive Magnetic Bearings Using Permanent Magnets. 

Part 1: Axial Magnetization”, IEEE Transactions on Magnetics, 

Vol. 45, No. 7, pp. 2996-3002, July 2009. 

[7] R. Ravaud, G. Lemarquand, S. Babic, V. Lemarquand, C. Akeyel.: 

“Cylindrical Magnets and Coils: Fields, Forces and Inductances”, 

IEEE Transactions on Magnetics, Vol. 46, No. 9, pp. 3585-3590, 



[8] M. Greconici, Z. Ž. Cvetković, A. N. Mladenović, S. R. Aleksić, 

D. Vesa.: “Analytical-numerical Approach for Levitation Force 

Calculation of a Cylindrical Bearing with Permanent Magnets 

Used in an Electric Meter” Proceedings of full papers OPTIM 

2010, pp. 197-201, 20-21 May, Brasov, Romania, 2010. 

[9] Furlani, E. P., S. Reznik, & A. Kroll. 1995. A three-dimensional 

field solution for radially polarized cylinders. IEEE Trans. Magn., 

vol. 31, no.1, pp. 844–851. 

[10] M. Braneshi, O. Zavalani and A. Pijetri.: “The Use of Calculating 

Function for the Evaluation of Axial Force between Two Coaxial 

Disk Coils”, 3 rd International PhD Seminar Computational 

Electromagnetics and Technical Application, pp. 21-30, 28 

August - 1 September, Banja Luka, Bosnia and Hertzegovina, 

2006. 

[11] Rakotoarison, H. L., J.-P. Yonnet, & B. Delinchant.2007. Using 

Coulombian Approach for Modeling Scalar Potential and 

Magnetic Field of a Permanent Magnet With Radial Polarization. 

IEEE Transactions on Magnetics, Vol. 43, No. 4, pp. 1261-1264. 

[12] R. Ravaud, G. Lemarquand, V. Lemarquand.: “Force and 

Stiffness of Passive Magnetic Bearings Using Permanent Magnets. 

Part 2: Radial Magnetization”, IEEE Transactions on Magnetics, 

Vol. 45, No. 9, pp. 3334-3342, September 2009. 

[13] Ana N. Vučković, Saša S. Ilić & Slavoljub R. Aleksić: Interaction 

Magnetic Force Calculation of Ring Permanent Magnets Using 

Ampere's Microscopic Surface Currents and Discretization 

Technique, Electromagnetics, 32:2, pp. 117-134, 2012. 

[14] A. N. Mladenović, S. R. Aleksić, S. S. Ilić.: “Levitation Force 

Calculation for Permanent Magnet Bearings Using Ampere’s 

Currents”, The 14 th International IGTE Symposium on Numerical 

Field Calculation in Electrical Engineering, Proceedings of full 

papers (CDROM), pp. 149-153, 19-22 September, Graz, Austria, 

2010 

[15] Meeker, D. n.d. Software package FEMM 4.2. Available on-line 

at http://www.femm.info/wiki/ Download (accessed 2 March 

2007).


Magnet deviation measurements and 

their consideration in 

electromagnetic field simulation 

Peter Offermann ∗ , Isabel Coenen ∗ , David Franck ∗ and Kay Hameyer ∗ 

∗ Institute of Electrical Machines 

RWTH Aachen University 

Schinkelstrasse 4 

D-52062 Aachen, Germany 

E-mail: Peter.Offermann@IEM.rwth-aachen.de 

Abstract—Due to their manufacturing process, arc segment magnets for use in permanent-magnet synchronous machines (PMSM) may show deviations from their intended ideal magnetization. Using magnets with unfavourable error constellations in one rotor of a PMSM results in a spatially asymmetric air gap field, causing undesired parasitic effects such as torque pulsations. Most manufacturer data contain only the mean values of the magnetization and certain guaranteed error bounds, without stating whether (and how) the magnetization varies spatially over a set of magnets. In order to allow an accurate consideration of these deviations in the machine simulation, the emitted radial field of a set of magnets has been measured and compared to their assumed magnetization using the finite element method (FEM). As a result, the measured deviations can be quantified and the influence of magnet deviations can be estimated using, e.g., stochastic collocation methods in combination with the FEM.

Index Terms—finite element method, magnetization errors, measurements, stochastic variations


I. INTRODUCTION

The simulation of an electrical machine with the finite element method (FEM) requires exact knowledge of the machine's geometry, its excitations and its material properties. For machines manufactured in mass production, the material or geometry of one specific instance of the designed machine may deviate from its specified targets [1], in the worst case leading to a failure to meet the machine's rated data.

A typical cause of geometry variations is the abrasion of the punching tools. Varying material properties may be caused, for example, by a stochastic jitter in the orientation of the punched stator lamination sheets, which can exhibit anisotropy. Variations in excitations can arise either from the converter or, in the case of a permanent-magnet synchronous machine (PMSM), from deviations of the magnets [2] from their intended ideal magnetization [3]. Using magnets with unfavourable error constellations in one rotor of a PMSM results in a spatially asymmetric air gap field, causing undesired parasitic effects such as torque pulsation [4], [5].

Most manufacturer data contain only the mean values of the magnetization and certain guaranteed error bounds, without stating whether (and how) the magnetization varies spatially over a set of magnets. The goal of this publication is hence to improve the simulation of electrical machines by reducing this epistemic uncertainty of magnet variations. For this purpose, a magnet test-bench has been built to measure the emitted radial field of a set of magnets. From these measurements, the modality and probability distribution of the occurring variations have been deduced.

The comparison of the magnets' FEM simulations with their measurements may allow the calculation of improved simulation parameters for complete machine simulations. For the measured magnets, which were diametrally magnetized, three error types have been identified: a general variation of the flux-density's strength of up to 11.6%, a maximal local angle deviation at the magnet's outer borders of 8°, and local errors of up to 9.1%.

II. MAGNETIZATION MEASUREMENT TEST-BENCH 

In order to obtain reliable data about possible magnetisation 

errors, a test bench for the evaluation of surface 

magnets has been built. In the following the sensor 

selection (sec. II-A) and the test-bench construction (sec. 

II-B) are described. 

A. Sensor selection 

Typical instruments for measuring the magnetic flux-density are Hall-sensors and Helmholtz-coils. In this paper, a Hall-sensor as depicted in fig. 1 has been selected, for the following reasons:

For best results, both methods require the measured magnetic field to be oriented perpendicular to the measuring coil or Hall-sensor, respectively. This is easier to accomplish for larger sensors than for very small devices. Hall-sensors can be miniaturized because an interaction with a given current is measured. The concomitant reduction of the Hall-constant CH, which is a consequence of the reduced material volume, can therefore be compensated to a certain extent by an increase in the measurement current (fig. 1). This makes it possible to measure field components nearly pointwise.

Fig. 1. Hall-sensor and its distinctive input sizes. (Labels in the figure: d, I, B, ϕ1, ϕ2.)

Helmholtz-coil configurations, in contrast to Hall-sensors, always measure the overall magnetic flux-density. Due to this integration over the magnet's surface flux-density, however, a pointwise resolution of the magnetic field is no longer possible. Global angle offsets in the magnetization can be detected with both measurement methods, either by using multiple sensors or coils, respectively, or by turning the magnet under test. For this purpose, coils are preferable, because their orientation can be adjusted more precisely and the integration over all local values for a single angle value is intrinsic to the coil. Local angle errors, however, cannot be detected using such a setup. Lastly, coil measurements are less sensitive to noise, because the integration already smooths some of the measurement noise.

The decisive factor in favour of Hall-sensors was the interest in local magnet variations, since most publications until now focus only on global magnet variations [6], [7] in electrical machines. Furthermore, this selection allows the analysis of possible positional misalignments of the magnets and will enable a later use of the measured variations in conformal mapping Ansatz functions [8], [9].

B. Test-bench construction 

For the construction of the magnet test bench, Hall-sensors of the type HE-244 [10] were selected. Table I summarizes the main features of the selected sensor:

TABLE I

PROPERTIES OF THE USED HALL SENSOR.

property                    value            unit
supply current              up to 10         mA
sensitivity                 90 to 190        V / (A · T)
linearity of Hall voltage   typical ≤ 0.2    %
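For orientation, the sensitivity in Table I relates the Hall voltage to the flux density through V_H = S · I · B. The sketch below inverts this relation. It is an illustration under stated assumptions: the function name is hypothetical, and any amplifier gain or offset in the actual measurement chain (not described here) is ignored.

```python
def flux_density_from_hall_voltage(v_hall, sensitivity, supply_current):
    """Flux density in tesla from the Hall voltage (V), the sensitivity
    (V / (A * T)) and the supply current (A), using V_H = S * I * B."""
    if sensitivity <= 0 or supply_current <= 0:
        raise ValueError("sensitivity and supply current must be positive")
    return v_hall / (sensitivity * supply_current)
```

For example, an assumed mid-range sensitivity of 140 V/(A·T) and a supply current of 5 mA give 0.7 V per tesla, so a Hall voltage of 0.35 V corresponds to 0.5 T.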

Three sensors for the measurement of the magnetic field components Bx, By and Bz are located on an index arm with predefined 90 degree edges in order to achieve good positioning. The sensors are positioned directly on adjacent edges so that the field is measured at approximately one point, as depicted in fig. 2.

Fig. 2. Positions and labelling of the used Hall-sensors on the measurement anchor. (Axes in the figure: x, y, z.)

The index arm itself is mounted on a gibbet, which is constructed in such a way that it allows position adjustment in all three dimensions. Below the index arm, the magnets under test can be mounted on a cylindrical shaft which rotates around its symmetry axis (fig. 3, 4).

Fig. 3. Schematic sketch of the created test bench for magnet measurements. (Labels in the figure: encoder, step motor, magnet mounting, rotation axis, hall sensor, magnet under test; axes x, y, z.)

This allows the use of a connected stepper-motor to measure the field along a circular line over the magnet's surface. To avoid field distortion by flux guidance, all relevant test bench components have been constructed from aluminium. Data acquisition and the stepper-motor control are implemented using a dSpace-system in combination with a PC.

III. RESULTS 

In this study, 52 magnets with diametral magnetization and a remanence flux-density of Br = 1.04 T were analysed, consisting of two equally sized groups with either the north- or the south-pole on the outer magnet circumference. For each magnet, the Hall-voltage of the radially outwards pointing flux-density was measured 1.5 mm above the magnet’s surface. The magnet’s dimensions are given in fig. 5.

A. Simulations 

In the simulations, the magnet (as depicted in fig. 5) is surrounded by an air layer which measures ten times the magnet’s height in every direction [11]. The applied solver implements the magnetic vector-potential formulation. All boundaries were set as Neumann conditions. The radial flux-density was sampled along a circumference 1.5 mm above the magnet.

Fig. 4. Photograph of the constructed magnet test bench.

Fig. 5. Dimensions of the measured magnets (figure labels: Br = 1.04 T, 3 mm, 15 mm).

B. Measurements 

1) Repetition measurements: 

Repetitive measurements were executed to determine the 

test-bench’s measurement reproducibility. The average 

error between two arbitrary measurements of the same 

magnet is below 0.5% and mainly caused by very small 

positioning errors of the magnet in the tangential direction 

of the measurement shaft. Fig. 6 depicts five 

repetitive measurements of magnet #7. 

2) Post-processing of measurements: 

For data acquisition, every magnet is inserted, measured, and removed from the test-bench five times (fig. 6). Afterwards, the repeated measurements of each magnet are scanned for obvious misplacement errors. If such errors exist, the worst deviating measurement is removed. Thereafter, the repeated measurements are aligned so that their outer minima are centred around a fixed value. Ultimately, the remaining, centred flux-density values of the magnet are averaged. Fig. 7 shows examples of the described process, exaggerated for the purpose of demonstration.

Fig. 6. Five repetitive measurements of magnet #7, showing the test-bench’s reproduction quality (axes: V(Brad) [V] over angle [°]).

Fig. 7. Post-processing of measured flux-density curves (panels: raw measurements, delete errors, x-align measurements, average).
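The post-processing chain (delete errors, align, average) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the deviation measure (RMS distance to the point-wise median), the `tolerance` threshold, the circular shift used for the alignment and all function names are hypothetical choices.

```python
from statistics import mean, median


def rms_deviation(curve, reference):
    """Root-mean-square distance between a curve and a reference curve."""
    return (sum((c - r) ** 2 for c, r in zip(curve, reference)) / len(curve)) ** 0.5


def drop_worst(curves, tolerance):
    """Remove the single worst deviating curve if it exceeds `tolerance` (assumed criterion)."""
    reference = [median(column) for column in zip(*curves)]
    deviations = [rms_deviation(c, reference) for c in curves]
    worst = max(range(len(curves)), key=deviations.__getitem__)
    if deviations[worst] > tolerance:
        return [c for i, c in enumerate(curves) if i != worst]
    return list(curves)


def centre_of_outer_minima(curve):
    """Index midway between the minimum of the left half and of the right half."""
    half = len(curve) // 2
    left = min(range(half), key=curve.__getitem__)
    right = min(range(half, len(curve)), key=curve.__getitem__)
    return (left + right) // 2


def align(curve, target_index):
    """Circularly shift a curve (a list) so that its centre lands on `target_index`."""
    shift = (target_index - centre_of_outer_minima(curve)) % len(curve)
    return curve[-shift:] + curve[:-shift] if shift else list(curve)


def postprocess(curves, tolerance, target_index):
    """Delete the worst outlier, align the remaining curves, average point-wise."""
    kept = drop_worst(curves, tolerance)
    aligned = [align(c, target_index) for c in kept]
    return [mean(column) for column in zip(*aligned)]
```

Calling `postprocess(curves, tolerance, target_index)` with the five repeated curves of one magnet returns a single averaged curve; the threshold would have to be tuned to the actual test bench.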

3) Variation measurements: 

Figure 8 presents the results of the variation measurements for all magnets which have their north pole located on the outer side. Two obvious variations can be directly identified:

• Strength variations in the overall remanence flux-density per magnet,

• Strong deformations from the expected curve shape in terms of local variations.

Fig. 8. Measured radial flux-density 1.5 mm above each magnet’s centre in the magnet group ’north-up’ (axes: V(Brad) [V] over angle [°]).

Fig. 9 shows accordingly the likelihood of occurrence 

for the radial outwards pointing flux-density over the 

magnet angle for the opposite magnet group. Due to the 

envelope shape of the resulting curve, the strong influence 

of the variations is even more obvious. 

Fig. 9. Probability of measured magnetisation strength, probabilities 

ranging from low (dark) to high (light). 

C. Comparison of measurements and simulations 

In order to quantify the strength of the occurring deviations in terms of changes in excitation (in contrast to changes in the resulting flux-density), the excitation of each magnet had to be reconstructed from the given measurements. To solve this inverse problem [12], a straightforward approach was to compare the measured radial flux-density component of each magnet to a set of simulations. In these simulations, the magnet’s remanence flux-density Br was varied as parameter ξ1, applying the simulation conditions presented in section III-A. However, the resulting shapes did not agree with the measured curves. The employed magnetisation model was therefore extended to include a second deviation parameter ξ2, allowing an angle spread in magnetisation as given in fig. 10 and yielding the excitation given in eq. 1:

$$\mathbf{B}(\Delta\alpha, \xi_1, \xi_2) = B_r(\xi_1) \cdot \begin{pmatrix} \cos(\alpha_{mid} + \Delta\alpha(\xi_2)) \\ \sin(\alpha_{mid} + \Delta\alpha(\xi_2)) \\ 0 \end{pmatrix} \tag{1}$$

Fig. 10. Determined second deviation parameter ξ2 (grey) from the ideal, unidirectional magnetisation (figure label: Δα).

Applying both variation types, the magnet excitation parameters could be reconstructed sufficiently in most cases using the least-squares minimization from eq. 2 for parameter determination:

$$\min_{\xi_1,\xi_2} \sum_{\alpha=230^\circ}^{310^\circ} \left[ B_{rad,sim}(\alpha, \xi_1, \xi_2) - B_{rad,mes}(\alpha) \right]^2 \tag{2}$$
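The least-squares identification of eq. 2 can be illustrated with a small grid search. In the sketch below the FE simulation is replaced by a purely hypothetical analytic stand-in (`toy_model`), the angular step of 1° and the parameter grids are assumptions, and an exhaustive grid search is used in place of whatever minimizer the authors applied.

```python
import math
from itertools import product

# Summation range of eq. (2); the 1 degree step is an assumption.
ANGLES_DEG = range(230, 311)


def toy_model(alpha_deg, xi1, xi2):
    """Hypothetical stand-in for the simulated radial flux-density B_rad,sim.

    xi1 scales the amplitude, xi2 stretches the angular shape. This is NOT
    the FE model of the paper, only a placeholder with two parameters.
    """
    return (1.0 + xi1) * math.cos(math.radians((alpha_deg - 270.0) * (1.0 + xi2)))


def cost(model, measured, xi1, xi2):
    """Sum of squared differences between model and measurement, as in eq. (2)."""
    return sum((model(a, xi1, xi2) - m) ** 2 for a, m in zip(ANGLES_DEG, measured))


def fit(model, measured, xi1_grid, xi2_grid):
    """Return the grid point (xi1, xi2) with the smallest cost."""
    return min(product(xi1_grid, xi2_grid),
               key=lambda p: cost(model, measured, p[0], p[1]))
```

With a real forward model, `toy_model` would be replaced by a lookup into the set of precomputed simulations, one per parameter combination.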

Fig. 11 compares the measured radial flux-density (dashed) with the best fitting simulated curve (solid). The divergence of both curves at the outer sides of the graph can safely be neglected here, because it is caused by effects of the 2D-simulation and is not considered relevant, as this area is not above, but beside the magnet.

Fig. 11. Measured (dashed) radial outwards pointing flux-density in comparison to its best fitting simulation for magnet #1 (axes: V(Brad) [V] over angle [°]).

Figure 12 finally shows the comparison of measured and simulated radial outwards pointing flux-density for a magnet having a local magnetisation error. As the graph clearly shows, this behaviour cannot be reproduced by the applied model yet. The three identified error types were finally quantified as: flux-density strength variations of up to 11.6%, a maximal local angle deviation at the magnet’s outer borders of 8°, and local errors of up to 9.1%.

Fig. 12. Measured (dashed) radial outwards pointing flux-density in comparison to its best fitting simulation for magnet #13 (axes: V(Brad) [V] over angle [°]). Local errors cannot be reproduced yet.


IV. CONCLUSION

The presented methodology allows an accurate determination of remanence flux-density variations above the surface of a set of magnets or rotors. A comparison of the measured curves with the magnet’s simulated and intended remanence flux-density reveals in which way the used FE-magnet-models have to be adapted to be used in stochastic considerations of parameter variations in electrical machines. Necessary implementations are a scalable magnetization strength and a deviation angle that changes over the magnet. Optionally, local errors can be considered as well. The resulting magnet parameters can finally be used for uncertainty propagation, applying appropriate tools such as stochastic collocation [13] or polynomial chaos approaches [14] to propagate the magnet deviations onto output quantities of interest.

V. ACKNOWLEDGEMENT 

The results presented in this paper have been developed 

in the research project Propagation of uncertainties 

across electromagnetic models granted by the Deutsche 

Forschungsgemeinschaft (DFG). 


REFERENCES 

[1] M. Cioffi, A. Formisano, and R. Martone, “Stochastic handling 

of tolerances in robust magnets design,” IEEE Transactions on 

Magnetics, vol. 40, no. 2, pp. 1252 – 1255, march 2004. 

[2] M.-F. Hsieh, C.-K. Lin, D. Dorrell, and P. Wung, “Modeling 

and effects of in-situ magnetization of isotropic ferrite magnet 

motors,” in Energy Conversion Congress and Exposition (ECCE), 

2011 IEEE, sept. 2011, pp. 3278 –3284. 

[3] K.-C. Kim, S.-B. Lim, D.-H. Koo, and J. Lee, “The shape 

design of permanent magnet for permanent magnet synchronous 

motor considering partial demagnetization,” IEEE Transactions 

on Magnetics, vol. 42, no. 10, pp. 3485 –3487, oct. 2006. 

[4] D. Torregrossa, A. Khoobroo, and B. Fahimi, “Prediction of 

acoustic noise and torque pulsation in pm synchronous machines 

with static eccentricity and partial demagnetization using field 

reconstruction method,” IEEE Transactions on Industrial Electronics, 

vol. 59, no. 2, pp. 934 –944, feb. 2012. 

[5] G. Heins, T. Brown, and M. Thiele, “Statistical analysis of the 

effect of magnet placement on cogging torque in fractional pitch 

permanent magnet motors,” IEEE Transactions on Magnetics, 

vol. 47, no. 8, pp. 2142 –2148, aug. 2011. 

[6] F. Jurisch, “Production process based deviations in the orientation 

of anisotropic permanent magnets and their effects onto the operation 

performance of electrical machines and magnetic sensors (in

German),” International ETG-Kongress Tagungsband, (ETG-FB

107), no. 1, pp. 255–261, 2007.

[7] I. Coenen, M. Herranz Gracia, and K. Hameyer, “Influence and 

evaluation of non-ideal manufacturing process on the cogging 

torque of a permanent magnet excited synchronous machine,” 

COMPEL, vol. 30, no. 3, pp. 876–884, 2011. 

[8] M. Hafner, D. Franck, and K. Hameyer, “Accounting for saturation 

in conformal mapping modeling of a permanent magnet 

synchronous machine,” COMPEL, vol. 30, no. 3, pp. 916–928, 

May 2011. 

[9] D. Zarko, D. Ban, and T. Lipo, “Analytical calculation of magnetic 

field distribution in the slotted air gap of a surface permanent-magnet

motor using complex relative air-gap permeance,” IEEE

Transactions on Magnetics, vol. 42, no. 7, pp. 1828 – 1837,

july 2006.

[10] H. Electronics, “He244 series analog hall sensor - datasheet,”

Download from www.hoeben.com, downloaded on 15.08.2012.


[11] P. Offermann and K. Hameyer, “Non-Linear stochastic variations 

in a magnet evaluated with Monte-Carlo simulation and a polynomial 

Chaos META-Model,” in XXII Symposium on Electromagnetic 

Phenomena in Nonlinear Circuits,EPNC 2012. Pula, 

Croatia: PTETIS Publishers, June 2012, pp. 21–22. 

[12] A. Mohamed Abouelyazied Abdallh, “An inverse problem based 

methodology with uncertainty analysis for the identification of 

magnetic material characteristics of electromagnetic devices,” 

Ph.D. dissertation, Ghent University, 2012. 

[13] E. Rosseel, H. De Gersem, and S. Vandewalle, “Nonlinear 

stochastic Galerkin and collocation methods: application to a 

ferromagnetic cylinder rotating at high speed,” Communications 

in Computational Physics, vol. 8, no. 5, pp. 947–975, 2010. 

[14] B. Sudret, “Uncertainty propagation and sensitivity analysis 

in mechanical models – contributions to structural reliability 

and stochastic spectral methods,” Ph.D. dissertation, Universite 

BLAISE PASCAL - Clermont II, Ecole Doctorale Sciences pour 

l’Ingenieur, 2007.


Potential of Spheroids in a Homogeneous 

Magnetic Field in Cartesian Coordinates 

Markus Kraiger∗ and Bernhard Schnizer † 

∗Institute for Radiopharmacy - PET Center, Helmholtz-Zentrum Dresden - Rossendorf e.V., Bautzner Landstr. 400, 

D-01328 Dresden - Schönfeld/Schullwitz, Germany. Email: m.kraiger@hzdr.de 

† Institute for Theoretical Physics - Computational Physics, Technische Universität Graz, Petersg. 16, A-8010 Graz, 

Austria 

E-mail: schnizer@itp.tu-graz.ac.at 

Abstract—The potential and the field of a prolate or an oblate magnetic spheroid in a static homogeneous field are computed 

and expressed in Cartesian coordinates. The directions of both the primary magnetic field and of the symmetry axis are 

completely arbitrary. These expressions are used to investigate trabecular structures built from spheroids having different 

symmetry axes and positions for Magnetic Resonance (MR-) Osteodensitometry. 

Index Terms—Prolate or oblate spheroid in homogeneous field, building flexible models for magnetic resonance imaging or 

spectroscopy. 


I. INTRODUCTION

In general, the potential of a magnetic spheroid in a given external magnetic field is derived in spheroidal coordinates, whose symmetry axis is the z-axis. Models of biological tissues, e.g. trabecular bones, are arrays of such spheroids with symmetry axes having various directions. With such applications in mind, we derived potential and field expressions for prolate and oblate spheroids in a homogeneous field. These expressions depend on Cartesian coordinates for arbitrary directions of both the field and the symmetry axes.

II. METHOD OF SOLUTION 

A spheroid (permeability μ_i = μ_0(1 + χ_i); semi-axes a, a, c) is in a medium (permeability μ_e = μ_0(1 + χ_e)) and a static homogeneous field H_0 = (H_0x, H_0y, H_0z) = H_0 (sin β cos α, sin β sin α, cos β) of arbitrary direction. At first the problem of a prolate spheroid is solved in prolate spheroidal coordinates ([1], Fig. 1.06)

$$x + iy = e_p \sinh\eta \, \sin\theta \, e^{i\psi}, \qquad z = e_p \cosh\eta \, \cos\theta \tag{1}$$

or in the corresponding oblate spheroidal coordinates ([1], Fig. 1.07)

$$x + iy = e_o \cosh\eta \, \sin\theta \, e^{i\psi}, \qquad z = e_o \sinh\eta \, \cos\theta \tag{2}$$

for an oblate spheroid as shown e.g. in [2] to [4]. The particular solutions of the potential equation are obtained by separation, giving Legendre functions and polynomials of cosh η or i sinh η, respectively, multiplied by Legendre polynomials of cos θ and by trigonometric functions of ψ. A solution of this problem is found by the usual method, namely by expanding the potential in the interior and in the exterior of the spheroid w.r.t. the particular solutions fulfilling the appropriate boundary conditions: i) the total potential must be finite at η = 0; ii) the total potential must agree with that of the primary field (5) at η = ∞. The expansion coefficients are determined by the continuity conditions that the total potential must be continuous, Φ_0 + Φ_e^σ = Φ_0 + Φ_i^σ, and that the corresponding normal component of the magnetic induction must be continuous at the interface of the two media ((7) with n = e_z). The solutions contain only Legendre functions and polynomials of order 1 since the inhomogeneity (5) is of that order. Thereafter the Legendre functions and polynomials may be replaced with elementary functions of η and θ. These may in turn be expressed by functions of Cartesian coordinates by use of (1) or (2), respectively, and by cosh η = u_p(r, e_z)/√2 or sinh η = u_o(r, e_z)/√2, eq. (24), respectively. The expansion coefficients L_0^σ, L_1^σ, M_0^σ, M_1^σ obtained from matching the two pieces of the potential at the interface are first expressed in Legendre functions and polynomials of argument η_p or η_o, respectively:

η_p = Arcoth(c_p/a_p) (3)

η_o = Artanh(c_o/a_o). (4)

The coefficients are also reexpressed in elementary functions of these geometrical parameters and of the magnetic susceptibilities χ_e, χ_i to give eqs. (8) to (11) or (13) to (16), respectively. In the last step the potential in both domains is transformed to an arbitrary direction n of the spheroidal symmetry axis. All vectors in the potential are decomposed into vectors parallel to or perpendicular to the z-axis. Finally all vectors e_z occurring in these expressions are replaced by n.
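As a quick plausibility check of eqs. (1) to (4), the following sketch computes the coordinate parameters of a spheroid from its semi-axes and maps a surface point back to Cartesian coordinates. The function names are hypothetical; the relations e_p² = c_p² − a_p² and e_o² = a_o² − c_o² follow from (1) and (2) evaluated on the surface η = η_p or η = η_o.

```python
import math


def prolate_parameters(a, c):
    """Return (e_p, eta_p) for a prolate spheroid with semi-axes a < c."""
    e = math.sqrt(c * c - a * a)
    eta = math.atanh(a / c)  # Arcoth(c/a) = Artanh(a/c), eq. (3)
    return e, eta


def oblate_parameters(a, c):
    """Return (e_o, eta_o) for an oblate spheroid with semi-axes a > c."""
    e = math.sqrt(a * a - c * c)
    eta = math.atanh(c / a)  # eq. (4)
    return e, eta


def prolate_to_cartesian(e, eta, theta, psi):
    """Eq. (1): prolate spheroidal coordinates to (x, y, z)."""
    rho = e * math.sinh(eta) * math.sin(theta)
    return rho * math.cos(psi), rho * math.sin(psi), e * math.cosh(eta) * math.cos(theta)


def oblate_to_cartesian(e, eta, theta, psi):
    """Eq. (2): oblate spheroidal coordinates to (x, y, z)."""
    rho = e * math.cosh(eta) * math.sin(theta)
    return rho * math.cos(psi), rho * math.sin(psi), e * math.sinh(eta) * math.cos(theta)
```

On the surface η = η_p the returned point satisfies (x² + y²)/a² + z²/c² = 1, which is a convenient sanity test before the symmetry axis is rotated to an arbitrary direction n.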

This description is rather concise; full details may be found 

in the papers [3] and [4] and in the notebooks at the website 

quoted. But the next paragraph gives a complete listing of all 

formulas needed for the applications. 

III. RESULTS 

The primary field is homogeneous with the potential 

Φ0(x, y, z) = − (H0x x + H0y y + H0z z). (5) 

A. The potentials of the reaction fields 

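The presence of a spheroid induces a reaction field with potential (r = (x_β)):

$$\Phi^{\sigma}_{k}(x, y, z) = \sum_{\alpha,\beta=1}^{3} H_{0\alpha}\, t^{\sigma,k}_{\alpha\beta}\, x_\beta = \mathbf{H}_0 \cdot \mathsf{T}^{\sigma,k} \cdot \mathbf{r} \tag{6}$$

with σ = p (= prolate) or = o (= oblate) and k = e (= external) or i (= internal) to the ellipsoid

$$E_\sigma := \frac{r^2 - (\mathbf{n} \cdot \mathbf{r})^2}{a_\sigma^2} + \frac{(\mathbf{n} \cdot \mathbf{r})^2}{c_\sigma^2} = 1. \tag{7}$$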

For a prolate spheroid, a_p < c_p, the eccentricity is e_p = √(c_p² − a_p²); for an oblate one, c_o

the additional contribution, originating from the local field inhomogeneities, to the effective transversal relaxation rate R_2^*. Further, R_2' ≈ γΔB with ΔB representing the field variation and γ the gyromagnetic ratio.

B. Theory: Computer simulation

The aim of the current simulation is to investigate the effects of micro cracks, as examples of trabecular rarefaction, on the induced line broadening of the resonance spectra. Thus, the evaluation of the magnetic field distribution was performed utilizing a two-compartment model, consisting of marrow and bone. In order to mimic the known trabecular micro structure within a vertebra [13], prolate ellipsoids were arranged appropriately within a three-dimensional unit cell.

The precession frequency of spins in a homogeneous magnetic 

field is determined through the magnetic induction B. 

Hence, in a first step the reaction fields induced by the susceptibility 

difference between the ellipsoids (trabeculae) and the 

background (bone marrow) were computed [14]. 

When a sample with a different susceptibility is introduced (in the current experiment, trabecular bone (χ_2) is surrounded by bone marrow (χ_1)), the resulting magnetic induction B_z can generally be written as:

B_z = μ (H_0z + M_z(r)) = μ_0 (1 + χ)(H_0z + M_z(r)), (31)

with M_z characterising the induced reaction field. Herein the units are given in the MKS-system, and susceptibility units are per unit volume.

Since the transversal magnetization decay of mineralized bone is several orders of magnitude faster than that of bone marrow, the received resonance signal in MR-Osteodensitometry is governed by the magnetization arising within the marrow. Thus M_z corresponds to the computed reaction fields ΔH_{r1,z} caused by the difference in magnetic property between bone and marrow. The resulting magnetic field distribution within the unit cell was determined as the sum of the individual contributions H_{zi} originating from all n ellipsoids:

$$\Delta H_{r1,z}(\mathbf{r}) = \sum_{i=1}^{n} H_{zi}(\mathbf{r}). \tag{32}$$

Interactions between the trabeculae have been neglected. This assumption is valid, since interactions between such structures include susceptibility effects of the second order, which give rise to field contributions of the order of H_0 (Δχ)², or ≈ H_0 · 10^−12.

In a simple MR experiment, excitation followed by an acquisition period, the signal of the free induction decay (FID) can be written as:

$$S(t) = \mathrm{const} \int_{VOI} d^3r \; e^{-i\omega(\mathbf{r})t} \, e^{-t/T_2}; \tag{33}$$

with ω(r) = γB_z(r) it follows:

$$S(t) = \mathrm{const} \int_{VOI} d^3r \; e^{-i\gamma B_z(\mathbf{r})t} \, e^{-t/T_2}. \tag{34}$$

Using again expression (31), the following expression in ΔH_{r1,z} can be found:

$$S(t) = \mathrm{const} \int_{VOI} d^3r \; e^{-i\gamma t \mu_0 (1+\chi)(H_{0z} + \Delta H_{r1,z}(\mathbf{r}))} \, e^{-t/T_2}. \tag{35}$$

This integral must be extended over the entire unit cell enclosing the ellipsoids.

In order to compare the simulation results with MR images, the magnitude of S(t) must be found. Except for the dissipative relaxation term e^{−t/T_2}, the expressions in (35) are purely oscillatory in H_0z. Hence, for the analysis of the signal course the essential decay can be expressed as:

$$|S(t)| = \mathrm{const} \left| \int_{VOI} d^3r \; e^{-i\gamma t \mu_0 (1+\chi) \Delta H_{r1,z}(\mathbf{r})} \right|. \tag{36}$$

ΔH_{r1,z}(r) can be computed according to (32) as the sum over all the reaction fields of the individual ellipsoids, where μ_0(1+χ) describes the magnetic permeability at the location r.
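Numerically, the integral in (36) becomes an average of unit phasors over sample points in the marrow. The sketch below shows this discretization under stated assumptions: the field values ΔH_{r1,z} are taken as already computed on equally weighted sample points, χ is treated as constant over those points, the proton gyromagnetic ratio is inserted as 2.675 · 10^8 rad/(s·T), and the function names are hypothetical (the original work used Mathematica).

```python
import cmath
import math

GAMMA = 2.675e8            # proton gyromagnetic ratio in rad/(s*T), assumed value
MU0 = 4.0e-7 * math.pi     # vacuum permeability in T*m/A


def signal_magnitude(delta_h_samples, t, chi):
    """Discretized eq. (36): |mean of exp(-i*gamma*t*mu0*(1+chi)*dH)| over samples.

    delta_h_samples: reaction field values in A/m at equally weighted points
    t: time in s, chi: (constant) volume susceptibility of the signal-giving medium
    """
    phase_factor = GAMMA * MU0 * (1.0 + chi) * t
    total = sum(cmath.exp(-1j * phase_factor * dh) for dh in delta_h_samples)
    return abs(total) / len(delta_h_samples)


def signal_curve(delta_h_samples, times, chi):
    """Signal magnitude at each time in `times`, normalized to 1 for a uniform field."""
    return [signal_magnitude(delta_h_samples, t, chi) for t in times]
```

A perfectly uniform reaction field gives a magnitude of 1 at all times; the broader the field distribution, the faster the phasors dephase and the signal decays.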

1) Algorithm: Utilizing the expression developed for the reaction field (28), the simulation was implemented in Mathematica (Wolfram Research, Inc.). The program computed the field distribution of ΔH_{r1,z}(r) in the sense of a histogram and generated the MR signal curve according to (36).

As input parameters, the spacing of the trabeculae in x-, y- and z-direction, the dimensions of the ellipsoids and the orientation of the symmetry axis with respect to the z-axis of the coordinate system had to be defined. Further, the susceptibilities of the bones and the background as well as the orientation of the applied homogeneous main magnetic field had to be set. The results of the simulations were the histograms of the magnetic field distribution and the signal curve, which was further utilized within a fitting procedure yielding the relaxation constant R_2'.

2) Data fitting: Utilizing the simulated signal curves, an exponential signal model was applied in order to approximate the relaxation time T_2' [15]. The computed signal intensities (36) at the echo times ranging from 0 to 50 ms, in 5 ms increments, were used to generate a single T_2' value by means of a non-linear least-squares approximation to a two-parameter fit function:

$$S(t) = A \, e^{-t/T_2'}. \tag{37}$$
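The two-parameter fit of (37) can be sketched with a small Gauss-Newton iteration. This is an illustrative stand-in for the unspecified fitting routine of the original work: the starting values come from a log-linear regression (which requires strictly positive signal values), the iteration count is arbitrary, and the function name is hypothetical.

```python
import math


def fit_exponential(times, signals, iterations=20):
    """Fit S(t) = A * exp(-t / T) in the least-squares sense; returns (A, T)."""
    n = len(times)
    # Starting values from a straight-line fit to log(S) (assumes signals > 0).
    logs = [math.log(s) for s in signals]
    t_mean = sum(times) / n
    l_mean = sum(logs) / n
    slope = (sum((t - t_mean) * (l - l_mean) for t, l in zip(times, logs))
             / sum((t - t_mean) ** 2 for t in times))
    rate = -slope
    amplitude = math.exp(l_mean + rate * t_mean)

    # Gauss-Newton refinement on the untransformed residuals A*exp(-rate*t) - S.
    for _ in range(iterations):
        jaa = jar = jrr = ga = gr = 0.0
        for t, s in zip(times, signals):
            e = math.exp(-rate * t)
            residual = amplitude * e - s
            d_a = e                        # derivative w.r.t. amplitude
            d_r = -amplitude * t * e       # derivative w.r.t. rate
            jaa += d_a * d_a
            jar += d_a * d_r
            jrr += d_r * d_r
            ga += d_a * residual
            gr += d_r * residual
        det = jaa * jrr - jar * jar
        if det == 0.0:
            break
        amplitude -= (jrr * ga - jar * gr) / det
        rate -= (jaa * gr - jar * ga) / det
    return amplitude, 1.0 / rate
```

For the echo times of the text one would pass `times = [0.005 * k for k in range(11)]` (in seconds) together with the signal magnitudes computed from (36).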

C. Model of vertebra 

The three-dimensional unit cell was composed of thirty prolate ellipsoids, fifteen aligned along each of the x- and z-directions, mimicking the initially intact trabeculae. The interruptions were simulated by replacing each trabecula with two ellipsoids displaced along the x- or z-axis by 50 μm, forming a crack. The configuration of the three-dimensional vertebra model and the applied parameter setting are given in Fig. 1.

Fig. 1. Depiction of the 3.75 × 3.75 × 3.75 mm³ unit cell; the x/z aligned sets are built up of three planes displaced by 750 μm. The trabeculae in each plane were modelled with a trabecular spacing and width of 500 μm and 120 μm, respectively. The trabecular micro fractures were simulated by replacing each of the intact trabeculae with two oppositely shifted versions.

D. Results 

The resulting reaction fields H_r1 before and after bone rarefaction are depicted in Fig. 2. Note that the field distribution is directly affected by the shape of the micro cracks, whereby the resulting field inhomogeneities in the vicinity of the spiky edges lead to the observed major field broadening. Prior to rarefaction, the initial field distribution covered a range of approximately ±1 A/m; afterwards, field values of almost ±2 A/m were found within the three-dimensional vertebra model. The effect of the interrupted bone mesh on the MR signal decay and the resulting estimated relaxation time T_2' is presented in Fig. 3. The modelled cracks reduced T_2' from initially 26.1 ms to approximately 14.4 ms.

Fig. 2. Resulting field distribution of the reaction field H_{r1,z} within the applied three-dimensional vertebra model. The trabecular cracks cause a broadening of the distribution, resulting in a more Lorentzian-like line shape. A main magnetic field H_0 = 2.38732 · 10^6 A/m with α = 30° and β parallel z-axes, and values of χ_1 = −0.62 · 4π · 10^−6 and χ_2 = −0.9 · 4π · 10^−6 were applied.

V. CONCLUSION 

The advantage of this new approach is that it is very easy 

to build and investigate structures built from spheroids with 

different axes and positions. There is no need of complicated 

coordinate transformations. 

The analytical solutions of the Laplacian potential problem 

of spheroids in Cartesian coordinates were successfully applied. 

Fig. 3. Resulting resonance signal decays affected by the reaction field 

Hr1,z of the vertebra model in the two situations. As a consequence 

of the increasing inhomogeneous reaction field a rapid signal decay in 

case of micro cracks is visible (green curve). The signals are normalized 

to the values at the first echo time TE, markers are indicating the 

computed signal values at TE. 


A three-dimensional magnetostatic problem in the area of MR- 

Osteodensitometry, susceptibility effects in the vicinity of micro 

cracks, was analysed. Within vertebrae affected by pathologies 

such as osteoporosis horizontally arranged structures get typically 

interrupted at first. The novel expressions make it possible 

to study the bone rarefaction along such pathologies, whereby 

either cracks of the horizontal, the vertical or arbitrary structures 

are accessible for modelling. 

In the present work just one application of the analytical expressions, the modelling of bone disorders in the area of MR-Osteodensitometry, was given. Other applications exist: in the field of functional MRI, for example, the developed toolbox eases the analysis of the BOLD (blood oxygenation level-dependent) contrast, where induced reaction fields in the surroundings of vascular networks are of great interest [16]. A fast and precise computation of the magnetic distortion is essential for improving the precision of the temperature determination in techniques using the proton resonance frequency (PRF) shift method [17], [18]. Temperature mapping in the vicinity of the needle electrode is a crucial determinant of MRI guided interventional radiofrequency ablations [19]. Further, in the field of metabolism studies using NMR spectroscopy (MRS) the expressions can be used in order to model specific cells introduced in solutes differing in magnetic susceptibility [20].

In summary, the authors believe that the novel formulation 

of solutions depending solely on the Cartesian coordinates will 

facilitate the modelling of countless magnetostatic problems. 

REFERENCES 

[1] P. Moon and D.E. Spencer: Field Theory Handbook. Including 

coordinate systems, differential equations and their solutions. 

Springer 1988. 

[2] P. W. Kuchel, and B. T. Birman, ”Perturbation of Homogeneous 

Magnetic Fields by Isolated Single and Confocal Spheroids. Implications 

for NMR Spectroscopy of Cells,” NMR in Biomedicine, 

vol.2 (4) pp. 151-160, 1989. 

[3] M. Kraiger, and B. Schnizer, ”Potential and Field of a Homogeneous 

Magnetic Spheroid of Arbitrary Direction in a Homogeneous 

Magnetic Field in Cartesian Coordinates,” to appear in COMPEL, 

2012. 

[4] M. Kraiger, and B. Schnizer, ”Reaction Fields of a Homogeneous 

Magnetic Spheroids of Arbitrary Direction in a Homogeneous 

Magnetic Field. A Toolbox for MRI and MRS of Heterogeneous 

Tissue.” Report ITPR-2011-021. Institute for Theoretical 

and Computational Physics. Technische Universität Graz, Austria. 

http://itp.tugraz.at/∼schnizer/MedicalPhysics/ 

[5] F. W. Wehrli, H. K. Song, P. K. Saha, and A. C. Wright, ”Quantitative 

MRI of the assessment of bone structure and function,” NMR 

in Biomedicine, vol. 19 pp. 731-764, 2006. 

[6] C. A. Davis, H. K. Genant, and J. S. Dunham, ”The effects of 

bone on proton NMR relaxation times of surrounding liquids,” 

Investigative Radiology, vol. 21 pp. 472-477, 1986. 

[7] S. Grampp, S. Majumdar, M. Jergas, P. Lang, A. Gies, and HK. 

Genant, ”MRI of bone marrow in the distal radius: in vivo precision 

of effective transverse relaxation times,” European Radiology, vol. 

5 pp. 43-48, 1995. 

[8] T. M. Link, J. C. Lin, D. Newitt, N. Meier, S. Waldt, and S. 

Majumdar, ”Computergestützte Strukturanalyse des trabekulären 

Knochens in der Osteoporosediagnostik,” Der Radiologe, vol. 38 

pp. 853-859 , 1998. 

[9] M. H. Arokoski, J. P. Arokoski, P. Vainio, L. H. Niemitukia, 

H. Kroeger, and J. S. Jurvelin, ”Comparison of DXA and MRI 

methods for interpreting femoral neck bone mineral density,” 

Journal of Clinical Densitometry, vol. 5 pp. 289-296. 2002. 

[10] H. Chung, F. W. Wehrli, J. L. Williams, and S. D. Kugelmass, 

”Relationship between NMR transverse relaxation, trabecular bone 

architecture and strength,” Proceedings of the National Academy 

of Sciences, vol. 90 pp. 10250-10254, 1993. 

[11] T. B. Brismar, T. Hindmarsh, and H. Ringertz, ”Experimental 

correlation between T2* and ultimate compressive strength in 

lumbar porcine vertebrae,” Academic Radiology, vol. 4 pp. 426- 

430, 1997.

[12] O. Beuf, D. C. Newitt, L. Mosekilde, and S. Majumdar, ”Trabecular 

Structure Assessment in Lumbar Vertebrae Specimens Using 

Quantitative Magnetic Resonance Imaging and Relationship with 

Mechanical Competence,” Journal of Bone and Mineral Research, 

vol. 16 pp. 1511-1519, 2001. 

[13] T. Hildebrand, A. Laib, R. Müller, J. Dequeker, and P. Rüegsegger, 

”Direct Three-Dimensional Morphometric Analysis of Human 

Cancellous Bone: Microstructural Data from Spine, Femur, Iliac 

Crest, and Calcaneus,” Journal of Bone and Mineral Research, vol. 

14 pp. 1167-1174, 1999. 

[14] C. J. C. Bakker, R. Bhagwandien, M. A. Moerland, and M.

Fuderer, ”Susceptibility artifacts in 2D FT spin-echo and gradient-echo

imaging: the cylinder model revisited,” Magnetic Resonance

Imaging, vol. 11 pp. 539-548, 1992.

[15] A. Fransson, S. Grampp, and H. Imhof, ”Effects of trabecular 

bone on marrow relaxation in the tibia,” Magnetic Resonance 

Imaging, vol. 17 pp. 69-82, 1998. 

[16] S. Ogawa, T. M. Lee, A. R. Kay, and D. W. Tank, ”Brain magnetic 

resonance imaging with contrast dependent on blood oxygenation,” 

Proceedings of the National Academy of Sciences of the United 

States of America, vol. 87 pp. 9868-9872, 1990. 

[17] J. C. Hindman, ”Proton resonance shift of water in gas and liquid 

states,” Journal of Chemical Physics, vol. 44 pp. 4582-4592, 1966. 

[18] V. Rieke, and K. B. Pauly, ”MR Thermometry,” Journal of 

Magnetic Resonance, vol. 27 pp. 376-390, 2008. 

[19] A. Boss, H. Graf, B. Müller-Bierl, S. Clasen, D. Schmidt, P. 

L. Pereira, and F. Schick, ”Magnetic susceptibility effects on the 

accuracy of MR temperature monitoring by the proton resonance 

frequency method,” Journal of Magnetic Resonance Imaging, vol. 

22 pp. 813-820, 2005. 

[20] P. W. Kuchel, ”Red cell metabolism: studies using NMR spectroscopy,” 

Proceedings of Australian Biochemistry Society, vol. 15 

pp. P5-P6, 1983. 



Application of Signal Processing Tools for Fault 

Diagnosis in Induction Motors-A Review 

*Jawad Faiz, *Amir Masoud Takbash, *Bashir Mahdi Ebrahimi and †Subhasis Nandi 

*Center of Excellence on Applied Electromagnetic Systems, School of Electrical and Computer Engineering, 

College of Engineering, University of Tehran, Tehran, Iran 

†Department of Electrical and Computer Engineering, University of Victoria, Victoria, BC V8W 3P6, Canada 

E-mail: jfaiz@ut.ac.ir 

Abstract—Use of efficient signal processing tools (SPTs) to extract proper indices for fault detection in induction motors (IMs) 

is the essential part of any fault recognition procedure. In this paper, the SPTs employed in fault identification of IMs

are analyzed in detail. Then, their capabilities and drawbacks for extracting indices in transient and steady-state modes

are critically assessed from different aspects. Extensive experimental results are used to support the discussion.

Different kinds of faults, including eccentricity, broken bars and bearing faults as the major internal faults in IMs, are

investigated.

Index Terms—Fault detection, Fast Fourier Transform, Hilbert, Wavelet Transform. 


I. INTRODUCTION

The ever increasing application of induction motors (IMs) and the importance of their uninterrupted operation in production lines make it necessary to diagnose internal faults in IMs quickly and precisely. The internal faults in IMs consist of electrical and mechanical faults. Electrical faults occur in the stator and rotor. Electrical faults of the squirrel-cage rotor of an IM consist of bar and end-ring breakage, which account for about 10% of the internal faults of squirrel-cage IMs [1]. The reasons for these faults are as follows:

1. Thermal stresses due to over-load and asymmetrical 

dissipation of heat which may change the hot spot. 

2. Magnetic stresses arising from electromagnetic 

forces. 

3. Mechanical stresses due to mechanical fatigue of 

different parts, bearing damage, etc. 

4. Stresses due to assembling process and centrifugal 

forces arising from shaft torque. 

Some impacts of broken bars on an IM are increased core losses and total losses in the faulty machine [2]-[4] and an asymmetrical vector diagram of the rotor current [2]. Broken bar and end-ring faults have been studied more than other internal faults of IMs. The faulty motor is studied using experimental or modeling and simulation methods. Following the analysis of the faulty motor by test or modeling, a proper signal must be selected. The signals used in the fault diagnosis process consist of mechanical and electrical signals. Three reasons make the stator current an appropriate signal for fault diagnosis: 1) internal motor faults have a unique effect on this signal; 2) no additional sensor is needed to monitor the signal; 3) the method is economical. After testing or modeling the motor and selecting a proper signal, the signal must be processed and the effect of the fault under study upon the signal determined. The signal processing methods are based on mathematical transformations. The well-known transformations are Fourier, Wavelet, Hilbert and multiple signal classification (MUSIC). These processors are widely used in fault diagnosis; however, intelligent methods such as Genetic, Fuzzy and Neural Network algorithms have recently been applied to make fault diagnosis methods more efficient [5]-[7]. Thus, depending on the fault type, the load conditions and the processor characteristics, a particular processor will be suitable for each case. To choose an appropriate processor, different faults and operating conditions such as load are considered.

II. FAST FOURIER TRANSFORM 

The Fourier transform expresses a signal as a sum of sinusoidal functions; it maps a time-domain signal to a frequency-domain signal and thereby determines the frequency components arising from the fault. To apply the Fourier transform, the signal must have two basic features: it must be stationary and periodic. The fast Fourier transform (FFT) is a computationally faster algorithm for the discrete Fourier transform (DFT). Application of these processors includes sampling and applying the Fourier transform. Sampling follows a set of rules described in [7].
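The sampling and transform steps can be illustrated with a minimal sketch. It is not taken from the paper: the test signal, the supply frequency, the slip, the sampling rate and the helper name dft_amplitude are all assumptions chosen for illustration. The signal carries a fundamental plus two weak components at (1±2s)fs, the broken-bar sidebands discussed in this section, and the helper evaluates single DFT bins directly (an FFT would return the same values for all bins at once).

```python
import cmath
import math


def dft_amplitude(samples, f_sampling, freq):
    """Amplitude of the sinusoidal component of `samples` at `freq` (Hz).

    This evaluates one bin of the discrete Fourier transform directly.
    """
    acc = sum(x * cmath.exp(-2j * math.pi * freq * k / f_sampling)
              for k, x in enumerate(samples))
    return 2.0 * abs(acc) / len(samples)


# Assumed values, for illustration only.
f_supply = 50.0      # supply frequency fs in Hz
slip = 0.04          # slip s
f_sampling = 1000.0  # sampling rate in Hz
duration = 1.0       # record length in s, giving 1 Hz frequency resolution

n_samples = int(f_sampling * duration)
f_low = (1 - 2 * slip) * f_supply   # lower sideband (1-2s)fs
f_high = (1 + 2 * slip) * f_supply  # upper sideband (1+2s)fs

# Synthetic stator current: unit fundamental plus two weak sidebands.
current = [
    math.sin(2 * math.pi * f_supply * k / f_sampling)
    + 0.02 * math.sin(2 * math.pi * f_low * k / f_sampling)
    + 0.01 * math.sin(2 * math.pi * f_high * k / f_sampling)
    for k in range(n_samples)
]

for f in (f_low, f_supply, f_high):
    amp = dft_amplitude(current, f_sampling, f)
    print(f"{f:5.1f} Hz: {20 * math.log10(amp):6.1f} dB")
```

The record length is chosen so that all three frequencies fall on exact DFT bins; with other values, spectral leakage from the fundamental would partly mask the sidebands, which is one reason the sampling rules matter in practice.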

A. Rotor Bars and End Ring Breakage Fault Diagnosis 

For broken bar and end-ring fault diagnosis in IMs, an FFT-based processor is often used, and the frequency spectra of the torque, speed, instantaneous power, body vibration and stator current signals are obtained. The torque signal has been employed as the reference for fault diagnosis in [3]. Harmonics at 2sfs (s: slip, fs: supply frequency) are produced in the torque frequency spectrum and are used for fault detection [8]. Some references use the speed signal for fault diagnosis, and here the 2sfs harmonics of the frequency spectrum are again proposed. Figure 1 exhibits the frequency spectrum of the speed signal and its variations due to broken rotor bars [3]. Torque and speed signals depend on external factors such as load, which makes the fault hard to diagnose. The procedure for acquiring these signals is also important, because sensors and other devices affect the accuracy of the operation.

Figure 1: Frequency spectrum of motor speed for different numbers of rotor broken bars [4].

Another signal that is considered for fault diagnosis is the case vibration signal. The reason for this vibration is the air-gap radial electromagnetic forces. Broken bars lead to odd harmonics in the frequency spectrum of the vibration signal. Although the component at twice the supply frequency has been used for fault diagnosis, it is not suitable because it also appears in the vibration spectrum of the healthy motor. The above-mentioned signal depends on the load and its detection requires a sensor [9]. On the other hand, applying the Fourier transform to transient signals such as speed and vibration does not yield accurate results. Pendulum oscillations and increment of

 

[10], [11 

proposed in [11] with some simplifying assumptions. As 

seen, broken bars generate 2sfs 

[10]. The instantaneous power signal can also be utilized for broken bar and end-ring fault diagnosis. The advantages of using the instantaneous power spectrum are listed in [12]. The harmonics of the motor output voltage following a power supply interruption can be used to diagnose the fault. The main idea of this method is to eliminate the harmonics generated by the voltage supply [13]. However, this signal is transient and applying the FFT to it leads to inaccurate fault detection. Intelligent algorithms can be used to diagnose the fault through the current signal envelope [14]. The drawback of this method is that the harmonics of the envelope signal depend on the severity of the rotor bar fault as well as on the location of the broken bars [13]. The current signal has been considered the most appropriate signal for internal fault diagnosis. Some references [15]-[17] use the time signal of the line current for fault diagnosis, but this fault is often detected through the line current harmonics [1]-[3]. The most important harmonics used for fault diagnosis are (1±2s)fs. The amplitudes of these harmonics are larger than those of other harmonics and they are easier to detect. The amplitude of harmonic (1-2s)fs depends on the rotor broken bars fault and its intensity, while harmonic (1+2s)fs mostly depends on the speed variations [18]. It is noted that harmonic (1-2s)fs may disappear when the broken bars have 90 degrees

increased by the broken bars fault. The reason for such an amplitude rise is the asymmetry of the rotor due to the fault, which generates a negative rotating field [22]. Table I shows these harmonics before and after the fault versus the number of broken bars (NBB) [3]. However, raising the fault degree produces smaller changes in the amplitudes of the sidebands. The reason is the increasing number of parallel current paths and the saturation due to the asymmetry of the currents passing through the bars. The influence of inter-bar currents on the broken bars fault has been studied in [23], and its effect, a reduction of the harmonic amplitudes, has been proved mathematically.

TABLE I
AMPLITUDES OF CURRENT SIDEBANDS FOR MOTOR WITH DIFFERENT ROTOR BROKEN BARS [4]

NBB    fs+2fr    fs-2fr
0      -58       -57
1      -54       -55
2      -53       -48
3      -48       -42
4      -46       -40

In [22], a broken end-ring has been considered. Figure 2 presents the stator current frequency spectrum for such a case. The starting current signal may be used for fault diagnosis [3], where broken bars generate harmonics (3±4k)fr in the current spectrum (Figure 3). The frequency spectrum of the starting transient current signal is determined by the short-time Fourier transform (STFT), which solves the problem of applying the processor to a transient signal. However, the dimensions of the window are fixed, and therefore it does not have good frequency and time resolution at the same time [24].

Figure 2: Frequency spectrum of stator current for broken end-ring [23].

Sometimes the current is used indirectly; for instance, the Park transform of the stator current has been used for fault diagnosis [25]. This method has some drawbacks, such as an unclear fault effect and susceptibility to noise, so it seems useful to apply it alongside other techniques such as intelligent methods. However, in this case a full data set is necessary. The Park transform of the stator current leads to the modulus of iD+jiQ, in which the harmonics arising from the broken bars fault in the line current appear at 2sfs and 4sfs. The advantage of these harmonics is that they are far from the fundamental harmonic, so their detection is simple.
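How a sideband near the fundamental moves to a low frequency in the Park vector modulus can be sketched as follows. The transformation coefficients below are the commonly used power-invariant form and are an assumption, since the paper does not state them; the currents, the slip and all function names are likewise hypothetical. A balanced (1-2s)fs component is added to a balanced fundamental, and the modulus of iD+jiQ is inspected at 2sfs.

```python
import cmath
import math


def park_modulus(i_a, i_b, i_c):
    """Modulus of the Park vector iD + j*iQ for one sample of phase currents."""
    i_d = math.sqrt(2.0 / 3.0) * i_a - (i_b + i_c) / math.sqrt(6.0)
    i_q = (i_b - i_c) / math.sqrt(2.0)
    return math.hypot(i_d, i_q)


def balanced_set(t, amplitude, freq):
    """Balanced three-phase cosine currents at time t."""
    angle = 2.0 * math.pi * freq * t
    return [amplitude * math.cos(angle - p * 2.0 * math.pi / 3.0)
            for p in range(3)]


def modulus_record(sideband_amplitude, f_supply=50.0, slip=0.04,
                   f_sampling=1000.0, n_samples=1000):
    """Park-vector modulus of a unit fundamental plus a (1-2s)fs sideband."""
    record = []
    for k in range(n_samples):
        t = k / f_sampling
        main = balanced_set(t, 1.0, f_supply)
        side = balanced_set(t, sideband_amplitude,
                            (1.0 - 2.0 * slip) * f_supply)
        record.append(park_modulus(*[m + s for m, s in zip(main, side)]))
    return record


def ripple_amplitude(samples, f_sampling, freq):
    """Amplitude of the component of `samples` at `freq` (single DFT bin)."""
    acc = sum(x * cmath.exp(-2j * math.pi * freq * k / f_sampling)
              for k, x in enumerate(samples))
    return 2.0 * abs(acc) / len(samples)


healthy = modulus_record(0.0)    # no sideband: constant modulus
faulty = modulus_record(0.02)    # assumed 2 % sideband
f_ripple = 2 * 0.04 * 50.0       # 2*s*fs for the assumed slip and supply

print("healthy ripple at 2sfs:", ripple_amplitude(healthy, 1000.0, f_ripple))
print("faulty ripple at 2sfs: ", ripple_amplitude(faulty, 1000.0, f_ripple))
```

In this idealized case the healthy modulus is constant, so any component at 2sfs stands out without a large fundamental next to it, which is the advantage the text attributes to the method.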

B. Impacts of Load Variation 

Side-band components vary with load torque fluctuations [26], [27]. Figure 4 shows the impact of the load upon the high and low side-bands of the stator current spatial vector [21]. Load fluctuation decreases the amplitude of the low side-band and increases the amplitude of the high side-band.

Figure 3: Frequency spectrum of starting current: (a) healthy motor, (b) motor with 4 rotor broken bars [3].

Figure 4: Impact of load upon high and low side-bands of stator current spatial vector [21].

C. Impact of Drive

In the presence of a drive and closed-loop circuits the situation differs. In a PWM-driven motor, odd harmonics as well as the third harmonic are injected into the motor. The resulting harmonics are fb=(m±2nks)fs (m = odd harmonic orders of the supply, n = odd harmonics due to rotor induced currents and k = integer number), and odd-order harmonic currents are induced in the rotor, which subsequently produce odd-order rotor flux in the air gap. Therefore, a new frequency pattern is introduced in faulty motors under PWM supply [28]. In a closed-loop drive, the electrical and mechanical oscillations amplify each other and the amplitude of the above-mentioned frequency components increases. Figure 5 shows the rotor asymmetry signature in the inverter-fed motor line current spectrum for a healthy motor and a motor with broken bars [29].
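The frequency pattern fb=(m±2nks)fs can be tabulated with a few lines of code. This is only an arithmetic sketch: the function name, the operating point (50 Hz supply, slip 0.04) and the chosen index sets are assumptions for illustration, not values from the paper.

```python
def pwm_fault_frequencies(f_supply, slip, m_orders, n_orders, k_values):
    """Evaluate fb = (m +/- 2*n*k*s)*fs for the given index sets."""
    freqs = set()
    for m in m_orders:
        for n in n_orders:
            for k in k_values:
                offset = 2 * n * k * slip
                # Rounding only removes floating-point noise in the listing.
                freqs.add(round((m - offset) * f_supply, 6))
                freqs.add(round((m + offset) * f_supply, 6))
    return sorted(freqs)


# Assumed operating point and index sets, chosen only for illustration.
bands = pwm_fault_frequencies(f_supply=50.0, slip=0.04,
                              m_orders=(1, 5, 7), n_orders=(1,),
                              k_values=(1,))
print(bands)
```

With m = 1 the expression reduces to the classical (1±2s)fs sidebands; the additional pairs around the fifth and seventh supply harmonics correspond to the regions shown in Figure 5(b).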

D. Impact of Broken Rotor Bars Location 

The location of broken rotor bars and its impact upon fault diagnosis have been investigated in [30], where the effect of the broken bars location on the waveform and frequency spectrum of the stator current and on the side-band components (Figure 6) is given. The influence of the broken bars location on the amplitude of the torque harmonics has been pointed out in [28]. The amplitude of the torque harmonic increases as the broken bars become more concentrated [31].

Figure 5: Rotor asymmetry signature in inverter-fed motor line current spectrum (a) around fundamental, (b) around fifth and seventh harmonics [29].

III. WAVELET TRANSFORM

The wavelet transform maps a signal to a joint time and frequency representation. It is based on expressing the signal in terms of scaled and shifted versions of a mother wavelet function [13]. The wavelet transform can reveal characteristics of the signal, such as discontinuities in high-order derivatives and sharp maxima, that cannot be shown by other transforms because they eliminate these characteristics during the transform [13]. Consequently, the wavelet transform gives a detailed and fully localized view of the function. Given the frequency components caused by an internal fault of the motor, this transform can concentrate on particular regions, which enhances the precision, while the Fourier series provides a general view over a period of the signal [12].
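The localization property can be illustrated with the simplest mother wavelet. The sketch below uses the Haar wavelet, which is an assumption made for brevity (the paper does not name the wavelet used), and a hypothetical test signal: a slow sinusoid with a small step. The level-1 detail coefficients stay small everywhere except at the step, so the transform reports where the discontinuity occurs.

```python
import math


def haar_step(signal):
    """One Haar DWT level: returns (approximation, detail), each half as long."""
    scale = 1.0 / math.sqrt(2.0)
    pairs = [(signal[i], signal[i + 1]) for i in range(0, len(signal) - 1, 2)]
    approx = [(a + b) * scale for a, b in pairs]
    detail = [(a - b) * scale for a, b in pairs]
    return approx, detail


def haar_dwt(signal, levels):
    """Multi-level Haar DWT: returns (final approximation, [D1, D2, ...])."""
    approx, details = list(signal), []
    for _ in range(levels):
        approx, detail = haar_step(approx)
        details.append(detail)
    return approx, details


# Hypothetical test signal: slow sinusoid with a 0.5 step starting at
# sample 301. The odd start index places the jump inside one Haar pair,
# so it is visible at level 1.
n_samples, step_at = 512, 301
signal = [math.sin(2.0 * math.pi * k / 128.0) + (0.5 if k >= step_at else 0.0)
          for k in range(n_samples)]

approx, details = haar_dwt(signal, levels=3)
d1 = details[0]
peak = max(range(len(d1)), key=lambda i: abs(d1[i]))
print("largest level-1 detail at coefficient", peak,
      "covering samples", 2 * peak, "and", 2 * peak + 1)
```

A Fourier spectrum of the same record would spread the step over many frequency bins without indicating its position, which is the contrast the paragraph above draws.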


Various wavelet transforms have so far been used for fault diagnosis. Most of these methods are based on the sideband components of the frequency spectrum of the current signal. In [28], the energy of a frequency band is used to diagnose the fault, and the load impact is also taken into account. Since the discrete wavelet transform (DWT) has better resolution at low frequencies, using the current spatial vector, which has harmonics at lower frequencies, yields more precise results [32]. In [33], a method based on the continuous wavelet transform (CWT) has been used to diagnose the fault in different drives. However, there is no physical interpretation for fault diagnosis using the figures. In [34], the power spectral density (PSD) of the detail signal at each level of the transform is the fault diagnosis criterion. Figure 7 shows the pattern of the current signal wavelet transform for a healthy motor and a motor with broken rotor bars [34]. In [35], the reasons for applying the DWT in the literature are noted: other wavelet methods lack a suitable physical description of the results, have complicated procedures and algorithms, and give ambiguous results. In [35], the fault has been diagnosed using the envelope of the starting current signal, and a procedure has been introduced for extracting the envelope signal considering the impact of the broken bars on the settling time and amplitude of the envelope of the starting current. The choice of the mother wavelet is also important in fault diagnosis. Torque ripples and unbalanced voltage generate harmonics similar to those of the broken bars, which reduces the accuracy of the fault diagnosis process. However, this can be solved by applying the DWT [36]. The analytical wavelet transform (AWT) is one of the wavelet transforms that has been used to diagnose the rotor broken bars fault. The advantage of this wavelet transform is that it keeps the time-domain, amplitude and phase characteristics as well as the frequency. The amplitude is related to the signal envelope and the phase is related to the time characteristics of the signal. In [37], the AWT has been used to diagnose the rotor broken bars fault with the help of the starting signal of the IM under low load levels.

Figure 6: Impact of bars location on amplitude of sidebands; (a) three broken bars in one pole and one broken bar in adjacent pole, (b) two broken bars in one pole and two broken bars in adjacent pole, (c) one bar under each pole [30].

Figure 7: Pattern of current signal wavelet transform, (a) healthy, (b) rotor broken bar motor [34].
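When a DWT is used, it helps to know which detail level nominally contains a given fault frequency. For a dyadic DWT, detail level j covers roughly the band from f_sampling/2^(j+1) to f_sampling/2^j. The sketch below applies this rule to assumed values (1 kHz sampling, 50 Hz supply, slip 0.04); the function names are hypothetical and real wavelet filters overlap at the band edges, so the result is a guide rather than an exact assignment.

```python
def detail_band(f_sampling, level):
    """Nominal frequency band (Hz) of DWT detail level `level` (1 = finest)."""
    return f_sampling / 2 ** (level + 1), f_sampling / 2 ** level


def level_containing(f_sampling, freq, max_level=16):
    """Detail level whose nominal band contains `freq`, or None."""
    for level in range(1, max_level + 1):
        low, high = detail_band(f_sampling, level)
        if low <= freq < high:
            return level
    return None


f_sampling = 1000.0           # assumed sampling rate in Hz
f_supply, slip = 50.0, 0.04   # assumed supply frequency and slip

for name, freq in (("(1-2s)fs", (1 - 2 * slip) * f_supply),
                   ("(1+2s)fs", (1 + 2 * slip) * f_supply),
                   ("2sfs", 2 * slip * f_supply)):
    level = level_containing(f_sampling, freq)
    print(name, freq, "Hz -> D%d" % level, detail_band(f_sampling, level))
```

Because the band edges scale with the sampling rate, the level that carries the fault signature changes whenever the sampling rate or the slip changes, which is one reason the load has to be taken into account.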

B. Impacts of Load Variation 

The impact of load fluctuations on the wavelet coefficients of the stator current of a motor with a broken bars fault has been studied in [38]. Table II summarizes the variations of the D4 coefficient and the values of a function defined in that reference. The un-decimated discrete wavelet transform (UDWT) is a type of DWT that is shift-invariant. This leads to good time precision for high-frequency harmonics, and good time and frequency precision for low-frequency harmonics. In addition to the DWT and CWT, there is another wavelet method called wavelet packet decomposition (WPD), which yields more precise results but is time consuming [37], [39]. The sidebands move to higher-order nodes of the WPD transform due to load fluctuations [39]. One important point in the application of this transform is the use of a proper node for fault diagnosis. For high loads the low-order nodes and for low loads the high-order nodes are investigated [39]. In [40], the impact of the drive on broken bar diagnosis using the wavelet transform has been studied. Although the fault diagnosis procedure and load impact have been considered in the above-mentioned reference, the location of the broken bars has not been taken into account.

TABLE II
VARIATIONS OF D4 COEFFICIENT OF WAVELET TRANSFORM OF CURRENT SIGNAL OF MOTOR UNDER BROKEN BAR FAULT AGAINST LOAD [38].

% of rated load    Mean Current (A)    Mean distortion in D4    Index2
0                  9.54                0.0923                   0.97%
33                 8.92                0.3220                   3.61%
71                 8.81                0.4044                   4.59%
100                8.74                0.5469                   6.25%
133                8.57                0.5674                   6.62%

IV. HILBERT PROCESSOR 

Due to the drawbacks of the above-mentioned processors, new processors such as the Hilbert processor have been introduced. Similar to the Fourier transform, the Hilbert transform is orthogonal with respect to its main transform. In addition, a function and its Hilbert transform have identical energy. One type of Hilbert transform is the Hilbert-Huang transform (HHT). In the HHT, the energy distribution in the time-frequency domain is obtained by estimating the local energy of the signal at different times and frequencies. In contrast to other time-frequency transforms, which depend on the window size and sampling frequency in the Fourier transform and on the mother wavelet in the wavelet transform, the HHT is independent of these parameters, which is an advantage of this transform. No-load and light-load cases for rotor broken bars and end-rings have been emphasized in [41].


In a no-load IM there is no harmonic arising from the load, but the fault harmonics are very close to the fundamental frequency. Here a Hilbert vector is defined for the signal, and using this vector instead of the signal itself has some advantages:

1. Only the phase current is required.

2. Generation of harmonic components due to the fault and deletion of non-applicable harmonics.

3. Elimination of frequency scattering.

4. Absence of the fundamental frequency, which allows a linear scale to be used on the vertical axis instead of a logarithmic scale and clarifies the graphs.

5. No need to sample at twice the Nyquist frequency.

Considering the low frequencies involved, the sampling rate is reduced to 0.1 of its normal value, which is useful in practice. In [42], a fault diagnosis method based on fundamental harmonic deletion and determination of the Hilbert modulus has been introduced. Figure 8 shows the Hilbert modulus for a healthy motor and a motor with broken bars. By increasing the fault degree, this modulus becomes larger due to the harmonics. In the following part a dimensionless numerical criterion with a low dependency on the load is introduced.

Figure 8: Hilbert modulus: (a) healthy motor, (b) motor with two broken bars [42].
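The idea of a Hilbert modulus can be sketched with a single phase current. The code is an illustration under stated assumptions, not the procedure of [42]: it builds the analytic signal through a direct DFT (a library routine such as scipy.signal.hilbert would normally be used), it skips the fundamental-deletion step, and the signal parameters are hypothetical. A weak (1-2s)fs component next to the fundamental appears in the modulus as a slow ripple at 2sfs.

```python
import cmath
import math


def analytic_signal(samples):
    """Analytic signal x + j*H{x} via a direct DFT (fine for short records)."""
    n = len(samples)
    twiddle = [cmath.exp(-2j * math.pi * i / n) for i in range(n)]
    spectrum = [sum(samples[k] * twiddle[(m * k) % n] for k in range(n))
                for m in range(n)]
    for m in range(n):
        if m == 0 or (n % 2 == 0 and m == n // 2):
            gain = 1.0   # DC and Nyquist bins are kept once
        elif m < n / 2:
            gain = 2.0   # positive frequencies are doubled
        else:
            gain = 0.0   # negative frequencies are removed
        spectrum[m] *= gain
    # Inverse DFT; twiddle[(-m*k) % n] equals exp(+2j*pi*m*k/n).
    return [sum(spectrum[m] * twiddle[(-m * k) % n] for m in range(n)) / n
            for k in range(n)]


# Assumed values: 50 Hz supply, slip 0.04, 250 Hz sampling, 1 s record.
f_sampling, n_samples = 250.0, 250
f_supply, slip = 50.0, 0.04
f_low = (1 - 2 * slip) * f_supply

current = [math.cos(2 * math.pi * f_supply * k / f_sampling)
           + 0.02 * math.cos(2 * math.pi * f_low * k / f_sampling)
           for k in range(n_samples)]

modulus = [abs(z) for z in analytic_signal(current)]
print("mean modulus:", sum(modulus) / len(modulus))
print("peak-to-peak ripple:", max(modulus) - min(modulus))
```

The ripple is about twice the assumed sideband amplitude and varies at 2sfs, far below the supply frequency, which is consistent with the remark that a reduced sampling rate suffices once the modulus has been formed.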

V. MUSIC PROCESSOR 

Recently, methods with high frequency precision such as Music and Root-Music have been proposed [43]. These methods are used where keeping a particular frequency is necessary, so precise information on the frequency components is required [19]. Like other transforms, this mathematical transform converts a signal to a sum of several signals with an identical feature. The Music transform expresses a signal as K complex sinusoidal pairs plus a signal e(n).
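For orientation, the signal model usually assumed by Music can be written out. The expressions below are the standard textbook form and are given as an assumption, since the paper does not state them: the amplitudes A_k, normalized frequencies f_k, phases \varphi_k, the matrix order M and the noise-subspace eigenvectors v_i are notation introduced here, and a real-valued current would contribute each sinusoid as a pair of complex exponentials.

```latex
% Assumed standard MUSIC signal model (not taken from the paper):
% K complex exponentials plus a noise term e(n).
x(n) = \sum_{k=1}^{K} A_k \, e^{\,j(2\pi f_k n + \varphi_k)} + e(n)

% Pseudospectrum built from the noise-subspace eigenvectors v_i of the
% M x M autocorrelation matrix of x(n), with M > K:
P_{\mathrm{MUSIC}}(f) =
  \frac{1}{\sum_{i=K+1}^{M} \left| \mathbf{s}^{H}(f)\, \mathbf{v}_i \right|^{2}},
\qquad
\mathbf{s}(f) = \left[ 1,\; e^{j 2\pi f},\; \dots,\; e^{j 2\pi f (M-1)} \right]^{T}
```

The pseudospectrum peaks sharply where the steering vector s(f) is orthogonal to the noise subspace, which is the source of the high frequency precision mentioned above.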

C. Rotor Bars and End Ring Breakage Fault Diagnosis 

A combination of the FFT and Music methods together with a method of fundamental harmonic deletion has been used for fault diagnosis in [44], because applying Music alone leads to error. This method provides clearer results compared to the FFT method. In Figure 9 the results of the two methods are compared [44]. In [45], the frequency spectrum of the output voltage after disconnecting the input supply has been obtained by the FFT and Music methods and the two have been compared. It has been shown that a series of particular harmonics in the frequency spectrum is excited by the broken bars, and variations of these harmonics are seen. Reliability and low sensitivity to noise are the advantages of this method compared to the FFT method [45]. Music-based methods, like the Hilbert transform, are new methods which give precise results, and their computation is quicker than that of the wavelet method. However, they need improved algorithms for deletion of the fundamental harmonic, which complicates these methods. Since these methods are new, many fault diagnosis indexes have not yet been modeled by them.


Figure 9: Comparison of FFT and Music-based methods for motor with 1 broken bar: (a) FFT, (b) Music [44].

VI. CONCLUSION

Different methods and processors used for the diagnosis of rotor bar and end-ring breakage faults in IMs were briefly investigated. To this end, four types of processors and their advantages and drawbacks were studied. It is clear that no single method or common processor can be specified for fault detection in all conditions. The Fourier processor, the most widely applied processor for the broken bars fault, has weak and strong points. Its most important weakness is in the processing of transient signals. To overcome this problem, the wavelet processor was suggested, which provides a more detailed time and frequency view of the signal. Subsequently, the wavelet packet, with simultaneously high precision in time and frequency, has come into common use. These processors are often used for the broken bars fault, but there is no adequate research on the number of broken bars and their location. Other drawbacks of this method are that it is time consuming and complicated. In recent years, Hilbert-based methods and high-frequency-precision methods such as Music have been proposed. A common point that must be taken into account in an appropriate fault diagnosis method for industry, besides on-line operation, is that the method must be quick and at the same time have good accuracy.

REFERENCES 

[1] J. Faiz, B. M. Ebrahimi, and M. B. B. Sharifian, "Different faults and their diagnosis techniques in three-phase squirrel-cage induction motors-a review," Electromagnetics, vol. 26, no. 7, pp. 543-569, October 2006.

[2] X. Ying, "Characteristic performance analysis of squirrel cage induction motor with broken bars," IEEE Trans. on Magnetics, vol. 45, no. 2, pp. 759-766, Feb. 2009.

[3] J. Faiz and B. M. Ebrahimi, "A new pattern for detecting broken rotor bars in induction motors during start-up," IEEE Trans. on Magnetics, vol. 44, no. 12, pp. 4673-4683, December 2008.

[4] J. Faiz, and B. M. Ebrahimi, "Determination of number of rotor 

broken bars and static eccentricity degree in induction motor 

under mixed fault," Electromagnetics, vol. 28, no. 6, pp. 433-449, 

August 2008. 

[5] M. J. Devaney and L. Eren, "Monitoring an induction motors current and detecting bearing failure," Trans. on Instrumentation and Measurement, vol. 7, no. 4, pp. 30-50, December 2004.

[6] B. Li, M. Y. Chow, Y. Tipsuwan, and J. C. Hung, "Neural-network-based motor rolling bearing fault diagnosis," IEEE Trans. on Industrial Electronics, vol. 47, no. 5, pp. 1060-1069, Oct. 2000.

[7] J. Faiz and I. Tabatabaei, "Extension of winding function theory for non uniform air gap in eccentric machinery," IEEE Trans. on Magnetics, vol. 38, no. 6, pp. 3654-3657,


[8] J. F. Bangura, R. J. Povinelli, N. A. Demerdash, and R. H. Brown, "Diagnostics of eccentricities and bar/end ring connector breakages in poly phase induction motors through a combination of time-series data mining and time-stepping coupled FE-state-space techniques," IEEE Trans. on Industry Applications, vol. 39, no. 4, pp. 1005-1013, Jul./Aug. 2003.

[9] P. J. Rodriguez, A. Belahcen, and A. Arkkio, "Signature of 

electrical faults in the force distribution and vibration pattern of 

induction motors," IEE Proc.-Electr. Power Appl., vol. 153, no. 4, 

pp. 523-529, July 2006. 

[10] C. Yeh, G. Y. Sizov, A. S. Ahmed, N. A. O. Demerdash, R. J. 

Povinelli, E. E. Yaz, and D. M. Ionel, "A reconfigurable motor for 

experimental emulation of stator winding inter turn and broken 

bar faults in poly-phase induction machines," IEEE Trans. on

Energy Conversion, vol. 23, no. 4, pp. 1005-1014, December 

2008. 

[11] B. Mirafzal, and N. A. O. Demerdash, "Effects of load magnitude 

on diagnosing broken bar faults in induction motors using the 

pendulous oscillation of the rotor magnetic field orientation," 

IEEE Trans. on Industry Applications, vol. 41, no. 3, pp. 771- 

783, May/June 2005. 

[12] G. Didier, E. Ternisien, O. Caspary, and H. Razik, "Fault 

detection of broken rotor bars in induction motor using a global 

fault index," IEEE Trans. on Ind. Appl., vol. 42, no. 1, pp. 79-88, 

Jan./Feb. 2006. 

[13] J. Milimonfared, H. Meshginkelk, S. Nandi, A. D. Minassians, 

and H. A. Toliyat, "A novel approach for broken-rotor-bar 

detection in cage induction motors," IEEE Trans. on Ind. Appl. , 

vol.35, no. 5, pp. 1000-1006, Sep./Oct. 1999. 

[14] A. M. D. Silva, R. J. Povinelli, and N. A. O. Demerdash, 

"Induction machine broken bar and end stator short-circuit fault 

diagnostics based on three-phase stator current envelopes," IEEE 

Trans. on Industrial Electronics, vol. 55, no. 3, pp. 1310-1318, 

March 2008. 

[15] C. E. Kim, Y. B. Jung, S. B. Yoon, and D. Im, "The fault diagnosis of rotor bars in squirrel cage induction motors by time-stepping finite element method," IEEE Trans. on Magnetics, vol. 33, no. 2, pp. 2131-2134, March 1997.

[16] N. M. Elkasabgy, A. R. Eastham, and G. E. Dawson, "Detection 

of broken bars in the cage rotor on an induction machine," IEEE 

Trans. on Industry Applications, vol. 22, no. 6, pp. 165-171, Jan./ 

Feb. 1992. 

[17] V. F. Pires, J. F. Martins, and A. J. Pires, "Eigen vector / Eigen 

value analysis of a 3D current referential fault detection and 

diagnosis of an induction motor," Energy Conversion and 

Management, vol. 51, issue. 5, pp. 901-907, May 2010. 

[18] S. Nandi, H. A. Toliyat, and X. Li, "Condition monitoring and 

fault diagnosis of electrical motors- a review," IEEE Trans. on 

Energy Conversion, vol. 20, no. 4, pp. 719-729, December 2005. 

[19] M. E. H. Benbouzid and G. B. Kliman, "What stator current processing-based technique to use for induction motor rotor faults diagnosis?," IEEE Trans. on Energy Conversion, vol. 18, no. 2, pp. 238-244, June 2003.

[20] M. Riera-Guasp, M. F. Cabanas, J. A. Antonino-Daviu, M. 

Pineda-Sanchez, and C. H. R Garcia, "Influence of 

Nonconsecutive Bar Breakages in Motor Current Signature 

Analysis for the Diagnosis of Rotor Faults in Induction Motors," 

IEEE Trans. on Energy Conversion, vol. 25, no.1, pp. 80-89, 

Jan./ Feb. 2010. 

[21] A. Bellini, F. Filippetti, G. Franceschini, C. Tassoni, and G. B. Kliman, "Quantitative evaluation of induction motor broken bars by means of electrical signature analysis," IEEE Trans. on Industry Applications, vol. 37, no. 5, pp. 1248-1255, Sep./Oct. 2001.

[22] J. F. Bangura, and N. A. Demerdash, "Diagnosis and 

characterization of effects of broken bars and connectors in 

squirrel-cage induction motors by a time-stepping coupled finite 

Element-state space modeling approach," IEEE Trans. on Energy 

Conversion, vol. 14, no. 4, pp. 1167-1176, December 1999. 

[23] R. F. Walliser, and C. F. Landy, "Determination of inter bar 

current effects in the detection of broken rotor bars in squirrel 

cage induction motors," IEEE Trans. on Energy Conversion, vol. 

9, no. 1, pp. 152-158, March 1994. 

[24] P. V. Goode, and M. Chow, "Using a neural/fuzzy system to 

extract heuristic knowledge of incident faults in induction motors: 

part II-application," IEEE Trans. on Industrial Electronics, vol. 

42, no. 2, pp. 139-146 , April 1995. 

[25] M. Haji, and H. A. Toliyat, "Pattern recognition- a technique for 

induction machine rotor broken bar detection," IEEE Trans. on 

Energy Conversion , vol. 16, no. 4, pp. 312-317, December 2001. 

[26] J. Faiz, B. M. Ebrahimi, and M. B. B. Sharifian, "Time Stepping 

Finite Element Analysis Of Broken Bars Fault In A Three-Phase 

Squirrel-Cage Induction Motor," Progress in Electromagnetics 

Research, PIER, vol. 68, pp. 53-70, 2007. 

[27] J. Faiz, B. M. Ebrahimi, H.A. Toliyat, and W.S. Abu-Elhaija, 

"Mixed – fault diagnosis in induction motors considering varying 

load and broken bars location," Energy Conversion and 

Management, vol. 51, issue. 7, pp. 1432-1441, July 2010. 

[28] D. F. Pires, V. F. Pires, J. F. Martins, and A. J. Pires, "Rotor cage fault diagnosis in three-phase induction motors based on a current and virtual flux approach," Energy Conversion and Management, vol. 50, issue 4, pp. 1026-1032, April 2009.

[29] B. Akin, U. Orguner, H. A. Toliyat, and M. Rayner, "Low order 

PWM inverter harmonics contributions to the inverter-fed 

induction machine fault diagnosis," IEEE Trans. on Industrial 

electronics, vol. 55, no. 2, pp. 610-619, February 2008. 

[30] J. Faiz, and B. M. Ebrahimi, "Locating rotor broken bars in 

induction motors using finite element method," Energy 

Conversion and Management, vol. 50, pp. 125-131, August 2008. 

[31] S. H. Kia, H. Henao, and G. A. Capolino, "Diagnosis of Broken-bar Fault in Induction Machines Using Discrete Wavelet Transform Without Slip Estimation," IEEE Trans. on Industry Applications, vol. 45, no. 4, pp. 1395-1404, July/August 2009.

[32] Z. K. Peng, M. R. Jackson , J. A. Rongong , F. L. Chu , and R. 

M. Parkin, " On the energy leakage of discrete wavelet transform," 

Mechanical Systems and Signal Processing, vol. 29, pp. 330-343, 

2009. 

[33] I. P. Georgakopoulos, E. D. Mitronikas, and A. N. Safacas, " 

Condition Monitoring of an Inverter-Driven Induction Motor 

Using Wavelets," Advanced Electromechanical Motion Systems 

& Electric Drives Joint Symposium, 2009. ELECTROMOTION 

2009. 8th International Symposium on, pp.1-5, 1-3 July 2009. 

[34] J. Cusido, L. romeral, J. A. Ortega, A. Rosero and A. G. 

Espinosa, "Fault Detection in Induction Machines Using Power 

Spectral Density in Wavelet decomposition," IEEE Trans. on 

Industrial Electronics, vol. 55, no. 2, pp. 633-643, February 2008. 

[35] R. Supangat, N. Ertugrul, W. L. Soong, D. A. Gray, and C. Hansen, "Broken Rotor Bar Fault Detection in Induction Motors Using Starting Current Analysis," EPE 2005, pp. 1-1, 2005.

[36] J. A. Daviu, M. R. Guasp, J. R. Floch, and P. M. Palomares, " 

Validation of a New Method for the Diagnosis of Rotor Bar 

Failures via Wavelet Transform in Industrial Induction 

Machines," IEEE Trans. on Industry Applications, vol. 42, no. 4, 

pp. 990-996, July/August 2006. 

[37] Z. Ye, B. Wu, and A. Sadeghian, "Current Signature Analysis of 

Induction Motor Mechanical Faults by Wavelet Packet 

Decomposition," IEEE Trans. on Industrial Electronics, vol. 50, 

no. 4, pp. 1217- 1228, December 2003. 

[38] J. Faiz, B.M. Ebrahimi, B. Asaie, R. Rajabioun, and H. A. 

Toliyat, "A Criterion Function for Broken Bar Fault Diagnosis in 

Induction Motor under Load Variation using Wavelet Transform," 

Electrical Machines and Systems, 2007. ICEMS. International 

Conference on, pp.1249-1254, 8-11 Oct. 2007. 

[39] S. A. Saleh, T.S. Radwan, and M.A. Rahman, "Real- Time of 

WPT-Based Protection of Three-Phase Vs PWM Inverter-Fed 

Motors," IEEE Trans. on Power Delivery, vol. 22, no. 4, pp. 

2108-2115, October 2007. 

[40] M. A. S. K. Khan and M. A. Azizur Rahman, "A New Wavelet 

Based Diagnosis and Protection of Faults in Induction Motor 

Drives," Power Electronics Specialists Conference, 2008. PESC 

2008. IEEE, pp.1536-1541, 15-19 June 2008. 

[41] I. Aydin, M. Karakose, E. Akin, "A new method for early fault 

detection and diagnosis of broken rotor bars," Energy Conversion 

and Management, vol. 52, issue 4, pp. 1790-1799, April 2011. 

[42] L. Sun, "Detection of Rotor Bar Breaking Fault in Induction 

Motors Based on Hilbert Modulus Gyration Radius of Filtered 

Stator current Signal," Electrical Machines and Systems, 2008. 

ICEMS 2008. International Conference on, pp. 877-881, 17-20 

Oct. 2008. 

[43] L. A. Periera, D. Fernandes, D. S. Gazzana, F. B. Libano, and S. 

Haffner, "Application of Welch, burg and music methods to the 

detection of rotor cage faults of induction motors," Transmission 

& Distribution Conference and Exposition: Latin America, 2006. 

TDC '06. IEEE/PES, pp. 1-6, 15-18 Aug. 2006. 

[44] B. Xu, S. liu and L. Sun, "A novel method for the early detection 

of broken rotor bars in squirrel cage induction motors," Electrical 

Machines and Systems, 2008. ICEMS 2008. International 

Conference on, pp. 751-754, 17-20 Oct. 2008. 

[45] S. Hamdani, A. Bouzida, O. Touhami, and R. Ibtiouen, "Diagnosis of Rotor Fault in Induction Motors Using Music Analyses of Terminal Voltage after Switch off," 18th International Conference on Electrical Machines, pp. 1-5, 6-9 Sept. 2008.



Experimental Calibration of Numerical Model 

of Thermoelastic Actuator 

*L. Voracek, *V. Kotlan and *B. Ulrych 

*University of West Bohemia, Faculty of Electrical Engineering, Univerzitní 26, 30614, Pilsen, Czech Republic 

Abstract—A numerical model of the thermoelastic actuator for accurate settings of position is compared with the data 

obtained by measurements on an experimental prototype. Some disagreements between the results became the reason for a 

calibration of the model carried out using an appropriate iterative process. 

Index Terms—Actuator, FEM, Measurement, Thermoelasticity. 

I. INTRODUCTION


Various industrial technologies work with extremely small and accurate shifts on the order of 10^-3 to 10^-6 m. One way of reaching such small shifts and exact positions is to use a thermoelastic actuator. Its theoretical background is described in previous papers written by our group (e.g., [1], [2]). Recently, we also built a prototype of the device and started experimentally verifying the calculated results. Certain discrepancies between the computations and measurements led to the necessity of calibrating the material parameters and their temperature dependences.

II. FORMULATION OF THE PROBLEM 

The arrangement of the device is depicted in Fig. 1. 

The dilatation element 2, made of a suitable electrically conductive
metal, is inserted into a coil 3 fixed in frame 4. The coil is supplied
by harmonic current. The whole system is placed in an insulating shell 1.
The device is clamped by its bottom part 5 in the basement 6, which is
supposed to be perfectly stiff. The time-variable magnetic field
generated by the field coil 3 induces eddy currents in the dilatation
element 2. These eddy currents produce heat and consequent geometrical
changes of the dilatation element (mainly in its longitudinal
direction z).

Figure 1. The basic arrangement of the device:
1 – shell, 2 – dilatation element, 3 – field coil, 4 – fixing frame,
5 – front, 6 – stiff wall

III. MATHEMATICAL MODEL 

Mathematical modelling of the device represents a 

triply coupled problem. Its mathematical model consists 

of three partial differential equations describing the 

distribution of magnetic field, temperature field and field 

of thermoelastic displacements. 

The device does not contain any ferromagnetic part. If the field coil 3
(Fig. 1) carries a harmonic current of density
$\underline{J}_{\mathrm{ext}}$, the magnetic field in the system may be
described by the Helmholtz partial differential equation for the phasor
$\underline{A}$ of the magnetic vector potential $\mathbf{A}$ in the
form [3]

$$\operatorname{curl}\operatorname{curl}\underline{A} + \mathrm{j}\,\omega\gamma\mu\,\underline{A} = \mu\,\underline{J}_{\mathrm{ext}}. \tag{1}$$

Here, the symbol $\mu$ denotes the magnetic permeability, $\gamma$ is the
electric conductivity, $\omega$ stands for the angular frequency, and
$\underline{J}_{\mathrm{ext}}$ is the phasor of the external harmonic
current density in the field coil. The conditions along the axis of the
device and along the artificial boundary placed at a sufficient distance
from the system are of the Dirichlet type ($\underline{A} = 0$).

Heat power in the system is generated by currents in the field coil and
induced currents in the dilatation element. The distribution of the
temperature $T$ in the system can be described by the equation [4]

$$\operatorname{div}(\lambda\,\operatorname{grad}T) - \rho c_p \frac{\partial T}{\partial t} = -p_J, \tag{2}$$

where $\lambda$ stands for the thermal conductivity, $\rho$ is the
specific mass, and $c_p$ denotes the specific heat at constant pressure.
Finally, the symbol $p_J$ stands for the time-average internal sources of
heat represented by the volumetric Joule losses. These are given by the
formula

$$p_J = \frac{|\underline{J}|^2}{\gamma}, \qquad \underline{J} = \underline{J}_{\mathrm{ext}} + \underline{J}_{\mathrm{ind}}, \qquad \underline{J}_{\mathrm{ind}} = -\mathrm{j}\,\omega\gamma\,\underline{A}, \tag{3}$$

where the symbol $\underline{J}_{\mathrm{ind}}$ denotes the current
density induced in the electrically conductive parts of the device.

The boundary conditions should take into account 

both convection and radiation. But since between the 

dilatation element 2 and field coil 3 there is a ceramic 

tube characterized by a very poor thermal conductivity 

and all parts are placed in a Teflon insulating shell 

(characterized also by a poor thermal conductivity), 

radiation can be – with a practically negligible error – 

disregarded. 

The last considered field is the field of thermoelastic displacements.
The distribution of displacements in the dilatation element 2 follows
from the solution of the Lamé equation [5]

$$(\varphi+\psi)\,\operatorname{grad}\operatorname{div}\mathbf{u} + \psi\,\Delta\mathbf{u} - (3\varphi+2\psi)\,\alpha_T\,\operatorname{grad}T + \mathbf{f} = \mathbf{0}, \tag{4}$$

where $\varphi \ge 0$, $\psi > 0$ are coefficients associated with
material parameters by the relations

$$\varphi = \frac{\nu E}{(1+\nu)(1-2\nu)}, \qquad \psi = \frac{E}{2(1+\nu)}. \tag{5}$$

Here $E$ is the modulus of elasticity and $\nu$ denotes the Poisson
coefficient of contraction. Finally, the symbol
$\mathbf{u} = (u_r, u_\vartheta, u_z)$ represents the displacement
vector, $\alpha_T$ is the coefficient of linear thermal dilatability of
the material, and $\mathbf{f}$ stands for the vector of the internal
volumetric forces. These consist (at least in the dilatation element 2)
of the gravitational and Lorentz volumetric forces, but in comparison
with the thermoelastic strains and stresses they are very small and may
be neglected.

The boundary conditions depend on the particular arrangement. In the
solved case the displacements of the dilatation element 2 at the place
of clamping are assumed to be equal to zero.

IV. NUMERICAL SOLUTION 

The numerical solution was performed by a combination of the professional
codes COMSOL Multiphysics and Matlab, supplemented with a number of our
own procedures and scripts. Special attention was paid to the convergence
of the results (dependence of the distribution of physical fields on the
density of the discretization meshes and, in the case of the
electromagnetic field, also on the position of the artificial boundary).
Part of the solution was verified against results obtained from the code
Agros2D, a new FEM-based program developed by a group at our department.
This program is well suited to the solution of various physical fields,
but it is still in a test version and some specific parts are not
finished yet. Therefore, it cannot yet be used for nonlinear systems.

V. ILLUSTRATIVE EXAMPLE AND RESULTS 

An illustrative example concerns a prototype of the thermoelastic
actuator built in our laboratory. Its arrangement and dimensions are
depicted in Fig. 2. The field coil is wound from a copper wire with
diameter $D_w = 0.63$ mm and has approximately 2400 turns. The coil is
supplied by a harmonic current of frequencies 519 Hz, 1005 Hz and
2210 Hz, with RMS values of 0.5 A and 1 A. The dilatation element is made
of brass UNS C26000; the nonlinear temperature-dependent characteristics
of its most important physical parameters are depicted in Figs. 3–5. The
other parts serve mainly as electrical and thermal insulation. All
remaining parameters for the computation are listed in Tab. 1. For
correct functioning, the entire device is fixed in a firm steel
construction.



Figure 2: Arrangement of the considered actuator (dimensions in mm): 

1–nylon shell, 2–brass core, 3–copper coil, 4–Teflon fixing part of the 

coil, 5–nylon stand, 6–metallic stand, 7–ceramic tube, 8–nylon cap 

Figure 3: Temperature dependence of thermal expansion coefficient for 

brass UNS C26000 (data from [6]). 

Figure 4: Temperature dependence of the thermal conductivity for brass 

UNS C26000 (data from [6]). 

Figure 5: Temperature dependence of the electrical conductivity for 

brass UNS C26000 (data from [6]).

Table 1: Physical properties of particular materials for the initial step of simulation.

Property                       brass     Teflon   ceramic  nylon    copper
rel. permeability [-]          1         1        1        1        1
rel. permittivity [-]          1         1        1        1        1
el. conductivity [S/m]         1.5e7     0        0        0        5.7e7
therm. cond. [W/(m·K)]         115       0.24     1.6      0.26     395
density [kg/m^3]               8440      2220     2500     1150     8930
spec. heat cap. [J/(kg·K)]     375       1050     1090     1100     313
Young modulus [Pa]             9.79e10   -        -        -        -
Poisson ratio [-]              0.301     -        -        -        -
therm. expans. coeff. [1/K]    18.7e-6   -        -        -        -
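
As a small numerical illustration, the two coefficients of the Lamé equation can be evaluated from the Young modulus and Poisson ratio of Tab. 1. The sketch below assumes that these two values belong to the brass dilatation element and uses the standard Lamé relations; the names `lame_coefficients`, `phi` and `psi` are chosen here for illustration only.

```python
def lame_coefficients(young_modulus, poisson_ratio):
    """Return the two Lame coefficients (phi, psi) in Pa.

    phi = nu * E / ((1 + nu) * (1 - 2 * nu))   (first Lame parameter)
    psi = E / (2 * (1 + nu))                   (shear modulus)
    """
    phi = (poisson_ratio * young_modulus
           / ((1.0 + poisson_ratio) * (1.0 - 2.0 * poisson_ratio)))
    psi = young_modulus / (2.0 * (1.0 + poisson_ratio))
    return phi, psi


# Values from Tab. 1, assumed here to describe the brass core.
E_BRASS = 9.79e10   # Young modulus [Pa]
NU_BRASS = 0.301    # Poisson ratio [-]

phi, psi = lame_coefficients(E_BRASS, NU_BRASS)
print(f"phi = {phi:.3e} Pa, psi = {psi:.3e} Pa")
```

Both coefficients come out positive for these inputs, as expected for a Poisson ratio between 0 and 0.5.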

A. Measurements 

The physical model of the thermoelastic actuator (see 

Fig. 6) was designed by our group at the University of 

West Bohemia. This is the first prototype that can help us 

to validate the results of several projects of devices based 

on the principle of thermoelasticity. 

Figure 6: The thermoelastic actuator – manufactured prototype. 

We also designed and manufactured a measurement stand for fixing the
device, which significantly contributes to the accuracy of the
measurements. The rigidity of the measurement stand is of extreme
importance for measuring such small shifts. In the future, the stand is
to be connected with a dynamometer so that the device can be used for
another purpose: the actuator can also act as a source of large forces
produced by small shifts of the dilatation element.

Figure 7: Arrangement of the measurement. 

The measuring circuit (see Fig. 7) consists of a 

measuring rack, thermoelastic actuator, function 

generator, amplifier, capacitance decade, auxiliary 

resistor and oscilloscope. The measurements of the 

thermoelastic actuator are carried out in several regimes 


characterized by the above frequencies and RMS values 

of the field current. A sinusoidal signal delivered from 

the frequency generator is amplified by the amplifier to 

the desired value of the current at a given frequency. 

A small resistor connected in series with the thermoelastic actuator is
used for measuring the current through the circuit. We measured the
voltage across this resistor with the oscilloscope and determined the
current from the measured voltage and the known resistance using Ohm's
law. Using an ammeter is inappropriate due to its internal resistance,
which would increase the total resistive load so that the input signal
could not be amplified sufficiently by the amplifier used. The
thermoelastic actuator can be considered an RL circuit, and the maximum
current through it is obtained by bringing it into resonance. This can
be achieved by adding a series capacitor. For given values of R and L and
a prescribed frequency f, its capacitance C is easily calculated using
the well-known Thomson relation

$$2\pi f = \frac{1}{\sqrt{LC}}. \tag{6}$$

For compensation we used rolled capacitors whose capacitance was
determined with respect to the given frequency. We had to respect the
available types of capacitors, however, which did not allow reaching
exactly the desired values of capacitance from (6). This explains why
the above frequencies differ from the “reasonable” values of 500, 1000
and 2000 Hz.
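
The effect of a limited capacitor selection can be sketched numerically with the Thomson relation (6). The inductance and the available capacitance below are hypothetical placeholders, not values from the measurement; the function names are likewise chosen for this illustration.

```python
import math


def resonance_capacitance(frequency_hz, inductance_h):
    """Capacitance [F] that satisfies 2*pi*f = 1/sqrt(L*C)."""
    return 1.0 / ((2.0 * math.pi * frequency_hz) ** 2 * inductance_h)


def resonance_frequency(inductance_h, capacitance_f):
    """Resonance frequency [Hz] of a series LC combination."""
    return 1.0 / (2.0 * math.pi * math.sqrt(inductance_h * capacitance_f))


# Hypothetical values for illustration only (not taken from the paper).
COIL_INDUCTANCE_H = 0.1            # assumed inductance of the field coil
TARGET_FREQUENCY_HZ = 500.0        # "reasonable" target frequency
AVAILABLE_CAPACITANCE_F = 1.0e-6   # assumed nearest stock capacitor

ideal_capacitance = resonance_capacitance(TARGET_FREQUENCY_HZ, COIL_INDUCTANCE_H)
actual_frequency = resonance_frequency(COIL_INDUCTANCE_H, AVAILABLE_CAPACITANCE_F)

print(f"ideal C  = {ideal_capacitance * 1e6:.4f} uF")
print(f"actual f = {actual_frequency:.2f} Hz")
```

With these assumed numbers the stock capacitor shifts the resonance a few hertz away from the target, which is the same mechanism that leads to frequencies such as 519 Hz instead of 500 Hz.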

The displacement of the top front of the dilatation element was measured
using the digital indicator MarCator 1088 (accuracy 0.001 mm) in the time
interval from 0 to 300 seconds in increments of 30 seconds. At the same
instants we measured the temperature inside the brass core. The
measurements of the internal temperature were performed on the coil. The
corresponding values were plotted in graphs for comparison with the
results from the mathematical model.

The measurements performed on an experimental 

prototype were used for calibration. The measured results 

are shown in Figs. 8 and 9. Figure 10 shows the 

dependence between the temperature and displacement in 

the brass core 2. 

Figure 8: Time dependence of the temperature for the specified
parameters (current RMS value 1 A and frequencies 519, 1005 and 2210
Hz) of the field current.

B. Numerical simulation 

For the numerical solution we used the previously mentioned programs and
algorithms based on the FE method. First, we created a geometrical model
according to the technical drawing that was used for manufacturing the
prototype. In the first step of the simulation we used the material
properties listed in Tab. 1, which were taken from the datasheets of the
base materials.

Figure 9: Time dependence of the displacement for the specified 

parameters (current RMS value 1 A and frequencies 519, 1005 and 2210 

Hz) of the field current. 

Figure 10: Dependence of the displacement on the temperature for the 

specified parameters (current RMS value 1 A and frequencies 519, 1005 

and 2210 Hz) of the field current. 

The time dependences of the displacement and temperature (Figs. 11 and
12) show a large discrepancy between the measurement and the simulation.

Figure 11: Time dependence of displacement for the first solution – 

current RMS value 1 A and selected frequency 1005 Hz. 


These discrepancies were evidently caused by the temperature-dependent
material properties used in the numerical simulation (which differed from
the real properties of the materials used for building the physical
model) and also by differences between the real geometry of the prototype
and the geometry used for the numerical model. Therefore, we checked all
dimensions of the model and the prototype. The next step was to use an
iterative process to find new material properties giving a better
agreement of the results. We focused especially on the brass, because we
did not know its exact designation and chemical composition. All
materials used in the prototype are non-ferromagnetic, with a relative
permeability close to one; therefore, all nonlinear characteristics used
are just the temperature dependences (see Figs. 3, 4 and 5). After
several corrections we found satisfactory values of the material
properties and, comparing them with the available database of
materials [6], we found that the brass used should be UNS C26000, whose
characteristics are shown in the mentioned figures and were used for the
final solution.
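
The idea of an iterative calibration can be illustrated with a deliberately simplified sketch. It is not the procedure used with the coupled FE model: the forward model here is only the free thermal elongation of a rod, the "measured" data are synthetic values generated inside the script, and the rod length, temperature rises and the one-dimensional golden-section search are all assumptions made for this example.

```python
import math


def simulated_displacement(alpha, length_m, temperature_rises_k):
    """Toy model: free thermal elongation alpha * L * dT for each sample."""
    return [alpha * length_m * d_t for d_t in temperature_rises_k]


def misfit(alpha, length_m, temperature_rises_k, measured_m):
    """Sum of squared differences between the toy model and the data."""
    simulated = simulated_displacement(alpha, length_m, temperature_rises_k)
    return sum((s - m) ** 2 for s, m in zip(simulated, measured_m))


def golden_section_minimize(func, lower, upper, tolerance):
    """Shrink [lower, upper] around the minimum of a unimodal function."""
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    while upper - lower > tolerance:
        left = upper - ratio * (upper - lower)
        right = lower + ratio * (upper - lower)
        if func(left) < func(right):
            upper = right   # minimum lies in [lower, right]
        else:
            lower = left    # minimum lies in [left, upper]
    return 0.5 * (lower + upper)


# Synthetic setup (hypothetical, not the measurements of this paper).
LENGTH_M = 0.1
TEMPERATURE_RISES_K = [0.0, 5.0, 10.0, 15.0, 20.0]
ALPHA_USED_FOR_SYNTHETIC_DATA = 20.0e-6
measured = simulated_displacement(
    ALPHA_USED_FOR_SYNTHETIC_DATA, LENGTH_M, TEMPERATURE_RISES_K)

alpha_fit = golden_section_minimize(
    lambda a: misfit(a, LENGTH_M, TEMPERATURE_RISES_K, measured),
    10.0e-6, 30.0e-6, 1.0e-12)
print(f"calibrated expansion coefficient: {alpha_fit:.3e} 1/K")
```

In a real calibration the toy forward model would be replaced by a run of the coupled simulation and several temperature-dependent parameters would be adjusted, but the loop structure (simulate, compare with the measurement, update the parameter) stays the same.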

Figure 12: Time dependence of temperature for the first solution – 


The next two figures show an acceptable agreement between the
measurement and the simulation for the frequency $f = 1005$ Hz and the
RMS value of the current $I_{in} = 1$ A.

Figure 13: Time dependence of displacement for the final solution – 

current RMS value 1 A and selected frequency 1005 Hz.

Figure 14: Time dependency of temperature for the final solution – 


Figures 15 and 16 depict the distribution of the displacement and
temperature in the dilatation element for time $t = 300$ s, frequency
$f = 1005$ Hz and current $I_{in} = 1$ A.

Figures 17 and 18 show the final comparison of the measured data with the
results of the simulation for all three frequencies (519 Hz, 1005 Hz and
2210 Hz).

A small discrepancy is visible in the results, especially for the
highest frequency. It may be caused by an incorrect temperature-dependent
characteristic of the thermal conductivity. Therefore, further steps are
necessary to find the exact model.

Figure 15: Graphical presentation of obtained results for total 

displacement of the dilatation element in time t = 300 s, for frequency 

f = 1005 Hz and current Iin = 1 A. 


Figure 16: Graphical presentation of obtained results for temperature in 

the model of the thermoelastic actuator in time t = 300 s, for frequency 

f = 1005 Hz and current Iin = 1 A. 

Figure 17: Comparison of the final solution results of the time 

dependency of displacement with the measured data – for current RMS 

value Iin = 1 A. 

Figure 18: Comparison of the final solution results of the time 

dependency of temperature with the measured data – for current RMS 

value Iin = 1 A.

VI. CONCLUSION


Thermoelasticity may prove to be a powerful tool in some applications
where accurate setting of position is needed. The process of reaching
the required dilatation is slow, but reliable.

Nevertheless, as shown in this paper, the accuracy of the results
strongly depends on the correctness of the input data. Further research
will therefore be aimed at possibilities of improving them. At present we
are preparing measurements of the material properties of the available
materials, and we want to carry out spectroscopy of the brass to find its
correct chemical composition and so improve the model. We also need to
improve the material properties of the other materials in the model,
which are not as important as the brass but can affect the results as
well.

Another important aim is to accelerate the heating process. New
possibilities based on a variable amplitude of the field current, which
can be realized, for example, by pulse-width modulation, are being
investigated.

ACKNOWLEDGMENT


This work was supported by the University of West 

Bohemia grant system (project No. SGS-2012-039) and 

by the Grant Agency of the Czech Republic (project 

102/11/0498). 

REFERENCES 

[1] I. Doležel, B. Ulrych and V. Kotlan, “Combined Actuator for
Accurate Setting of Position Based on Thermoelasticity Produced
by Induction Heating,” IEEE Transactions on Industry
Applications, vol. 47, no. 5, pp. 2250–2256, 2011, ISSN 0093-9994.

[2] I. Doležel, V. Kotlan and B. Ulrych, “Electromagnetic-thermoelastic
actuator for accurate wide-range setting of position,” Przeglad
Elektrotechniczny, vol. 87, no. 5, pp. 22–27, 2011, ISSN 0033-2097.

[3] M. Kuczmann and A. Iványi, The Finite Element Method in
Magnetics. Budapest: Akademiai Kiado, 2008.

[4] J. P. Holman, Heat Transfer. New York: McGraw-Hill, 2002.

[5] B. Boley and J. Weiner, Theory of Thermal Stresses. New York, 1960.

[6] MPDB Database of materials: www.jahm.com.



Scattering Calculations of Passive UHF-RFID 

Transponders 

*Thomas Bauernfeind, *Gergely Koczka, *Kurt Preis and *Oszkár Bíró 

*Institute for Fundamentals and Theory in Electrical Engineering, Inffeldgasse 18, 8010 Graz, Austria 

E-mail: t.bauernfeind@TUGraz.at 

Abstract—Besides the energy extraction capability from impinging electromagnetic waves provided by the interrogator, the

signal strength of the field scattered by the transponder is also a quality criterion of UHF-RFID transponder tags. In general, 

the transponder antenna is conjugate complex matched to the nominal transponder IC impedance to achieve maximum 

energy extraction. The reverse link from the transponder to the reader is commonly not taken into account. Due to the 

modulation technique and a voltage limiter circuit at the analog IC frontend, the input impedance of the IC is strongly 

nonlinear. In the present paper a method is proposed which is able to analyze the influence of this nonlinearity on the 

scattered signal by separating the scattered field of the transponder to a reference scattering problem and a pure radiation 

problem. 

Index Terms— Antenna Impedance, Radar Cross Section, UHF-RFID. 

I. INTRODUCTION


In passive backscattering applications like UHF-RFID 

(ultra high frequency-radio frequency identification) the 

communication between the interrogator unit (reader) and 

the transponder is established by means of modulating the 

radar cross section (RCS) of the transponder. In general, 

for UHF-RFID applications, this is done by switching the 

analog input impedance between two states in phase with 

the data stream to be transmitted, e.g. the EPC (electronic 

product code) value [1], [2]. Unfortunately, the analog IC input
impedance is not constant; it depends strongly and nonlinearly on the
applied power, as shown, for example, in Figure 1. This behavior is
mainly caused by a voltage limiter at the transponder IC’s frontend and
by the power consumption of the IC, which is determined by the actual
mode of operation of the transponder [3]. Hence, the

transponder input impedance is a function of the induced 

antenna voltage [4]. To capture this nonlinear behavior, 

an iterative full-wave simulation of the whole channel 

including the reader antenna, the air volume and the tag 

antenna as proposed in [5] should be applied, taking into 

account the feedback of the tag on the reader. The IC 

behavior is commonly taken into account by means of 

circuital co-simulation [6]. Due to the huge problem domain, this
technique is impractical for optimization studies. One possibility to
reduce the computational costs is to describe the scattered field of the
nonlinearly terminated tag antenna in terms of a reference scattered
field and a pure radiated field from the excited transponder antenna [7].
Since, in general, the effect of the tag on the reader field is small,
the feedback on the reader is neglected in a first approximation. Hence,
the scattered field can be calculated by superimposing those fields, each
computed with the finite element method.

II. GENERALIZED SCATTERING MATRIX 

In Figure 2a) a typical UHF-RFID application 

consisting of a transponder tag and a reader antenna is 

shown. Following the approach described in [7], the 

situation at the RFID-transponder tag can be modeled as 

shown in Figure 2b) where the field situation is described 

in terms of spherical wave modes. A mathematical 

description of the simplified transponder model is given 

by the generalized scattering matrix [7], [8]: 

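$$\begin{bmatrix} b \\ d_1 \\ \vdots \\ d_N \end{bmatrix} = \begin{bmatrix} S_{00} & S_{01} & \cdots & S_{0N} \\ S_{10} & S_{11} & \cdots & S_{1N} \\ \vdots & \vdots & \ddots & \vdots \\ S_{N0} & S_{N1} & \cdots & S_{NN} \end{bmatrix} \begin{bmatrix} a \\ c_1 \\ \vdots \\ c_N \end{bmatrix}. \tag{1}$$

Figure 1: Nonlinear transponder IC impedance versus applied power
(NXP Ucode G2X). Plot of Real{Z_IC} (resistance in Ohm) and Imag{Z_IC}
(reactance in Ohm), un-modulated and modulated, versus Pa in dBm.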

In (1), a relationship between the complex applied (a) 

and reflected (b) waves at the transmission line 

connecting the load impedance to the antenna and the 

incoming (cn) and outgoing (dn) spherical wave mode 

series is given. 

Figure 2: a) Typical RFID application. b) Simplified transponder
model.

In case of an antenna driven by external waves ($c_1, \ldots, c_N$), the
mode on the feed transmission line is reflected at the load impedance
$Z_L$ according to the load reflection coefficient given by

$$\Gamma = \frac{Z_L - Z_0}{Z_L + Z_0}. \tag{2}$$

So the total field can be written as:

$$b^{t} = \frac{1}{1-\Gamma S_{00}} \sum_{m=1}^{N} S_{0m} c_m, \tag{3}$$

$$d_n^{t} = S_{n0}\,\Gamma\, b^{t} + \sum_{m=1}^{N} S_{nm} c_m. \tag{4}$$

Combining (3) and (4), the total outgoing field $d_n^{t}$ can be
calculated as:

$$d_n^{t} = \frac{\Gamma S_{n0}}{1-\Gamma S_{00}} \sum_{m=1}^{N} S_{0m} c_m + \sum_{m=1}^{N} S_{nm} c_m. \tag{5}$$

In (5), $S_{00}$ is the antenna port reflection coefficient given by

$$S_{00} = \frac{Z_{Ant} - Z_0}{Z_{Ant} + Z_0}. \tag{6}$$

The coefficients S0m are the transmission coefficients 

from the antenna to the transmission line, Sn0 the 

transmission coefficients from the transmission line to the 

antenna and Snm are the mode reflection coefficients 

directly connecting the incoming and outgoing wave 

modes. The first term in (5) can be thought of as only 

load impedance dependent and the second term as 

structure dependent, respectively. 
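
The closed form (5) can be checked numerically on a small example. The sketch below uses a made-up 3×3 scattering matrix (feed port plus two modes), a made-up load reflection coefficient and made-up incoming amplitudes. It compares the combined expression with a simple fixed-point iteration of the successive reflections between load and antenna port. The iteration serves only as an independent check and is not part of the method described here.

```python
def outgoing_by_iteration(s_matrix, gamma_load, incoming, n_iterations=60):
    """Outgoing mode amplitudes obtained by bouncing the feed-line wave.

    s_matrix is a nested list of size (N+1) x (N+1); index 0 is the feed
    line port and indices 1..N are the spherical wave modes.
    """
    n_modes = len(incoming)
    coupled = sum(s_matrix[0][m + 1] * incoming[m] for m in range(n_modes))
    a_wave = 0.0
    for _ in range(n_iterations):
        b_wave = s_matrix[0][0] * a_wave + coupled   # first row of (1)
        a_wave = gamma_load * b_wave                 # reflection at the load
    return [
        s_matrix[n + 1][0] * a_wave
        + sum(s_matrix[n + 1][m + 1] * incoming[m] for m in range(n_modes))
        for n in range(n_modes)
    ]


def outgoing_closed_form(s_matrix, gamma_load, incoming):
    """Outgoing mode amplitudes from the combined expression (5)."""
    n_modes = len(incoming)
    coupled = sum(s_matrix[0][m + 1] * incoming[m] for m in range(n_modes))
    load_term = gamma_load / (1.0 - gamma_load * s_matrix[0][0]) * coupled
    return [
        s_matrix[n + 1][0] * load_term
        + sum(s_matrix[n + 1][m + 1] * incoming[m] for m in range(n_modes))
        for n in range(n_modes)
    ]


# Hypothetical numbers, chosen only so that |gamma * S00| < 1.
S_EXAMPLE = [
    [0.2 + 0.1j, 0.3, 0.1j],
    [0.3, 0.1, 0.05],
    [0.1j, 0.05, -0.2],
]
GAMMA_EXAMPLE = 0.4 - 0.3j
C_EXAMPLE = [1.0, 0.5j]

d_iter = outgoing_by_iteration(S_EXAMPLE, GAMMA_EXAMPLE, C_EXAMPLE)
d_closed = outgoing_closed_form(S_EXAMPLE, GAMMA_EXAMPLE, C_EXAMPLE)
max_difference = max(abs(x - y) for x, y in zip(d_iter, d_closed))
print(f"largest difference between both evaluations: {max_difference:.2e}")
```

Setting `GAMMA_EXAMPLE` to zero removes the first (load dependent) term, so only the structure dependent contribution of the mode reflection coefficients remains.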

If one is interested in the scattered field only, one has to subtract
the field in absence of the antenna ($d_n = c_n$) from (5), yielding

$$d_n^{s} = \frac{\Gamma S_{n0}}{1-\Gamma S_{00}} \sum_{m=1}^{N} S_{0m} c_m - c_n + \sum_{m=1}^{N} S_{nm} c_m. \tag{7}$$

Since a short circuit condition can easily be achieved, it is reasonable
to use the short circuited case as reference. With $\Gamma = -1$, one can
calculate the scattered field in the short circuit case too:

$$d_n^{s,sc} = -\frac{S_{n0}}{1+S_{00}} \sum_{m=1}^{N} S_{0m} c_m - c_n + \sum_{m=1}^{N} S_{nm} c_m. \tag{8}$$

Using (7) and (8) it is now possible to rewrite (7) in terms of the
reference scattered field from the short circuit condition as:

$$d_n^{s} = d_n^{s,sc} + \frac{(1+\Gamma)\,S_{n0}}{(1-\Gamma S_{00})(1+S_{00})} \sum_{m=1}^{N} S_{0m} c_m. \tag{9}$$

In (9) it is assumed that the incoming spherical wave mode series $c_m$
is known. Since the finite element method is to be applied, the
description in terms of the wave modes is not practical. Introducing the
antenna short circuit current $I_{sc}$ and describing the radiated field
$d_n^{rad}$ in terms of an antenna driving current $I_{Ant}$ as presented
in [7], one can eliminate the incoming wave mode series $c_m$ from (9),
yielding:

$$d_n^{s} = d_n^{s,sc} - d_n^{rad}\,\frac{I_{sc}\,Z_0\,(1+\Gamma)(1+S_{00})}{2\,I_{Ant}\,Z_{Ant}\,(1-\Gamma S_{00})}, \qquad n = 1, \ldots, N. \tag{10}$$

Equation (10) is the key expression enabling the description of a
scattered field in terms of a reference scattered field and a pure
radiated one. Finally, using (2) and (6), (10) can be written as:

$$d_n^{s} = d_n^{s,sc} - d_n^{rad}\,\frac{I_{sc}\,Z_L}{I_{Ant}\,(Z_{Ant}+Z_L)}. \tag{11}$$

The electric field $\mathbf{E}$ is the summation of the spherical wave
mode series, so $\mathbf{E}$ is given by [7], [11]:

$$\mathbf{E}_{Scattered} = \mathbf{E}_{Short} - \mathbf{E}_{Antenna}\,\frac{I_{sc}\,Z_L}{I_{Ant}\,(Z_{Ant}+Z_L)}. \tag{12}$$
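
The substitution of (2) and (6) into (10) can be verified numerically. The sketch below assumes that the load dependent factor in (10) has the form Z0 (1 + Γ)(1 + S00) / (2 Z_Ant (1 − Γ S00)) and that it reduces to Z_L / (Z_Ant + Z_L) in (11) and (12). The antenna and load impedances are the values quoted in Section IV, while the reference impedances of 50 Ω and 75 Ω are assumptions made for the check.

```python
def reflection_coefficient(impedance, reference_impedance):
    """(Z - Z0) / (Z + Z0), the form used in (2) and (6)."""
    return (impedance - reference_impedance) / (impedance + reference_impedance)


def load_factor_from_s_parameters(z_load, z_ant, z_ref):
    """Assumed load dependent factor of (10), written with Gamma and S00."""
    gamma = reflection_coefficient(z_load, z_ref)   # (2)
    s00 = reflection_coefficient(z_ant, z_ref)      # (6)
    return z_ref * (1 + gamma) * (1 + s00) / (2 * z_ant * (1 - gamma * s00))


def load_factor_from_impedances(z_load, z_ant):
    """The same factor written with impedances only, as in (11) and (12)."""
    return z_load / (z_ant + z_load)


Z_ANT = complex(100.15, 44.05)    # antenna impedance from Section IV
Z_LOAD = complex(62.76, -15.6)    # modulated IC impedance from Section IV

factor_50 = load_factor_from_s_parameters(Z_LOAD, Z_ANT, 50.0)
factor_75 = load_factor_from_s_parameters(Z_LOAD, Z_ANT, 75.0)
factor_direct = load_factor_from_impedances(Z_LOAD, Z_ANT)
print(factor_50, factor_75, factor_direct)
```

The three printed values agree, which reflects that the reference impedance Z0 drops out of the final expression, and a short circuited load (Z_L = 0) gives a factor of zero, so only the reference field remains.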

Since, especially for UHF-RFID applications, the transponder antenna is
in general conjugate complex matched to the nominal transponder IC
impedance, it is advantageous not to use the short circuit case as
reference but rather the conjugate complex matched case. In [7], [8] it
is described how to eliminate the short circuit case from (12) to get the
final relationship:

$$\mathbf{E}_{Scattered}(Z_L) = \mathbf{E}_{Scattered}(Z_{Ant}^{*}) + \frac{\Gamma^{*} I_m^{*}}{I_{Ant}}\,\mathbf{E}_{Antenna}. \tag{13}$$

In (13), $I_m^{*}$ is the current at the terminal of the antenna for a
conjugate complex matched transponder antenna in case of a pure
scattering problem and $\Gamma^{*}$ is the conjugate matched reflection
coefficient:

$$\Gamma^{*} = \frac{Z_{Ant}^{*} - Z_L}{Z_{Ant} + Z_L}. \tag{14}$$

Finally, this means that the field scattered by an antenna 

terminated with a certain load impedance ZL can be 

calculated by a superposition of a reference scattering 

field and a scaled radiated field of the antenna driven 

with a current IAnt. 
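
A short numerical sketch of the conjugate matched reflection coefficient shows how the scaling of the radiated field behaves. It assumes that (14) reads Γ* = (Z_Ant* − Z_L) / (Z_Ant + Z_L) and uses the antenna and IC impedances quoted in Section IV; the function name is chosen for this illustration.

```python
def conjugate_match_reflection(z_load, z_ant):
    """(Z_Ant* - Z_L) / (Z_Ant + Z_L), the assumed form of (14)."""
    return (z_ant.conjugate() - z_load) / (z_ant + z_load)


Z_ANT = complex(100.15, 44.05)         # antenna impedance from Section IV
Z_IC_MATCHED = Z_ANT.conjugate()       # conjugate complex matched IC
Z_IC_MODULATED = complex(62.76, -15.6) # modulated IC impedance, Section IV

gamma_matched = conjugate_match_reflection(Z_IC_MATCHED, Z_ANT)
gamma_modulated = conjugate_match_reflection(Z_IC_MODULATED, Z_ANT)

print(f"matched:   |Gamma*| = {abs(gamma_matched):.3f}")
print(f"modulated: |Gamma*| = {abs(gamma_modulated):.3f}")
```

For the matched load the coefficient vanishes, so the scattered field equals the reference field. For any other load the radiated field is added with a weight proportional to Γ*.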

III. NUMERICAL INVESTIGATIONS 

The basic electromagnetic field problem shown in 

Figure 3 is analyzed with a finite element based in-house 

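code. Introducing the magnetic vector potential $\mathbf{A}$ and the
modified scalar potential $V$, the electric field intensity $\mathbf{E}$
and the magnetic field intensity $\mathbf{H}$ in the time harmonic case
can be written as:

$$\mathbf{E} = -\mathrm{j}\omega\mathbf{A} - \mathrm{j}\omega\,\operatorname{grad}V, \tag{15}$$

$$\mathbf{H} = \frac{1}{\mu}\operatorname{curl}\mathbf{A} = \nu\,\operatorname{curl}\mathbf{A}. \tag{16}$$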

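Using $n_1$ edge basis functions $\mathbf{N}_i$ for the magnetic vector
potential $\mathbf{A}$ and $n_2$ nodal basis functions $N_i$ for the
modified electric scalar potential $V$ as proposed in [9], the Galerkin
equations to be solved become:

$$\int_{\Omega} \operatorname{curl}\mathbf{N}_i \cdot \nu\,\operatorname{curl}\mathbf{A}_h\,d\Omega + \mathrm{j}\omega \int_{\Omega} \sigma_c\,\mathbf{N}_i \cdot \mathbf{A}_h\,d\Omega + \mathrm{j}\omega \int_{\Omega} \sigma_c\,\mathbf{N}_i \cdot \operatorname{grad}V_h\,d\Omega + \frac{\mathrm{j}\omega\,l}{Z_{IC}\,w} \int_{\Gamma_{SIBC}} \mathbf{N}_i \cdot \mathbf{A}_h\,d\Gamma = 0 \quad (i = 1, 2, \ldots, n_1), \tag{17}$$

$$\mathrm{j}\omega \int_{\Omega} \sigma_c\,\operatorname{grad}N_i \cdot \mathbf{A}_h\,d\Omega + \mathrm{j}\omega \int_{\Omega} \sigma_c\,\operatorname{grad}N_i \cdot \operatorname{grad}V_h\,d\Omega = 0 \quad (i = 1, 2, \ldots, n_2). \tag{18}$$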

In (17) and (18), the approximations of the potential functions are
given by:

$$\mathbf{A}_h = \sum_{i=1}^{n_1} a_i\,\mathbf{N}_i, \tag{19}$$

$$V_h = \sum_{i=1}^{n_2} V_i\,N_i. \tag{20}$$

The needed truncation of the problem domain has been realized by
applying perfectly matched layers (PMLs) as proposed in [10]. For the
first basic scattering investigations, the reader antenna structure
shown, e.g., in Figure 2a) was not modeled. Instead, the actual
scattering problem was excited by means of a Hertz-dipole as proposed
in [4], since the Hertz-dipole can be modeled by a filament current of
given length. Besides the reduction in the number of degrees of freedom,
the main advantage of this excitation technique is that the radiated
power of a Hertz-dipole can be calculated analytically [11], which makes
it possible to validate the quality of the results obtained in
post-processing. On the other hand, the feedback of the transponder tag
on the reader is neglected in this case. Since the influence of the
transponder on the reader is small in general for UHF-RFID applications,
this effect is assumed to be negligible in a first approximation [4].
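
The analytical reference value mentioned above can be sketched as follows, using the textbook expression for the power radiated by an infinitesimal (Hertz) dipole in free space, P = η (π/3) |I l / λ|², with I the current amplitude. The current and filament length below are hypothetical placeholders, since the excitation values of the simulation are not stated here.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0           # m/s
VACUUM_PERMEABILITY = 4.0e-7 * math.pi   # H/m (classical value)


def hertz_dipole_radiated_power(current_amplitude_a, length_m, frequency_hz):
    """Time-average power [W] radiated by a Hertz dipole in free space."""
    wavelength = SPEED_OF_LIGHT / frequency_hz
    wave_impedance = VACUUM_PERMEABILITY * SPEED_OF_LIGHT   # about 376.7 Ohm
    return (wave_impedance * math.pi / 3.0
            * (abs(current_amplitude_a) * length_m / wavelength) ** 2)


# Hypothetical excitation: 1 A amplitude on a 1 cm filament at 1 GHz.
radiated_power = hertz_dipole_radiated_power(1.0, 0.01, 1.0e9)
print(f"radiated power: {radiated_power:.4f} W")
```

Comparing such a value with the power flux integrated over a closed surface in the FE solution gives a simple plausibility check of the mesh and of the PML truncation.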

The excitation of the pure radiation problem is done by impressing a
voltage $U_0$ at the feed gap of the dipole structure by prescribing a
constant vector potential over the length $\Delta y$ of the feed gap:

$$E_y\,\Delta y = -\mathrm{j}\omega A_y\,\Delta y = U_0. \tag{21}$$

As can be seen from (17), the IC impedance is modeled with a surface
impedance boundary condition (SIBC) as proposed in [12]:

$$Z_{IC} = \frac{U}{I} = \frac{\int_{l} E_{t1}\,ds}{\int_{w} H_{t2}\,ds} = \frac{E_{t1}\,l}{H_{t2}\,w} = Z_{SIBC}\,\frac{l}{w}, \tag{22}$$

where l is the length of the impedance geometry and w is the width. The
surface impedance $Z_{SIBC}$ in (22) is given by the ratio of the
tangential component $E_{t1}$ of the electric field intensity to the
tangential component $H_{t2}$ of the magnetic field intensity, assuming
constant tangential components at the surface impedance.

Due to the choice of the basis functions, the resulting system of
equations (17) and (18) becomes singular, which is not a drawback when an
iterative solver is applied. Unfortunately, the resulting system of
equations is ill-conditioned as described in [13], [14]. Hence, a direct
solver [14] has to be applied to avoid impractically long simulation
times. Due to the singularity of the system of equations, a tree
gauging [15] is needed to be able to apply the direct solver.


IV. BASIC EXAMPLE 

The proposed method is tested on a very basic example shown in Figure 3.
Applying an electric boundary condition

$$\mathbf{E} \times \mathbf{n} = \mathbf{0} \tag{23}$$

(which is a Dirichlet boundary condition for $\mathbf{A}$) in the
xz-plane and a magnetic boundary condition

$$\mathbf{H} \times \mathbf{n} = \mathbf{0} \tag{24}$$

(which is a Neumann boundary condition for $\mathbf{A}$) in the yz-plane,
only a quarter of the problem has to be modeled. The half wavelength
dipole at a frequency of f = 1 GHz, with a length of
$0.5\,l_{Ant} = 7.5$ cm, a width of $0.5\,w_{Ant} = 2.5$ mm and a
thickness of d = 1 mm, is placed at a distance of 25 cm from the
Hertz-dipole excitation, which is 0.8 times the wavelength $\lambda$.
Hence, far field conditions can be assumed. In Figure 3b) a detail of the
feed gap with a width of $w_{gap}$ = 200 μm is shown. The SIBC is
connected via perfect electric conductors (PEC) to the excitation and the
antenna structure.
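
The geometric figures of this example can be cross-checked with a few lines. The sketch assumes free-space propagation and that the modeled 7.5 cm is half of the total dipole length, so the full dipole measures 15 cm.

```python
SPEED_OF_LIGHT = 299_792_458.0   # m/s


def wavelength(frequency_hz):
    """Free-space wavelength [m] at the given frequency."""
    return SPEED_OF_LIGHT / frequency_hz


FREQUENCY_HZ = 1.0e9
DISTANCE_M = 0.25              # spacing between Hertz-dipole and tag dipole
FULL_DIPOLE_LENGTH_M = 0.15    # assumed as 2 x 7.5 cm

lam = wavelength(FREQUENCY_HZ)
distance_in_wavelengths = DISTANCE_M / lam
dipole_length_in_wavelengths = FULL_DIPOLE_LENGTH_M / lam

print(f"wavelength       = {lam * 100:.2f} cm")
print(f"distance / lam   = {distance_in_wavelengths:.2f}")
print(f"dipole len / lam = {dipole_length_in_wavelengths:.2f}")
```

At 1 GHz the wavelength is close to 30 cm, so the 25 cm spacing corresponds to roughly 0.8 wavelengths and the 15 cm dipole to half a wavelength.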

a) b) 

Figure 3: a) Typical RFID application. b) Simplified transponder 

model. 

The input impedance of the dipole antenna structure in 

the present model is calculated to be 

ZAnt = 100.15 + j 44.05 Ω. Hence, the input impedance of
the fictitious IC must be ZIC = 100.15 – j 44.05 Ω to
fulfill the conjugate complex matching. With the given IC

impedance, the needed reference field can be calculated 

by subtracting the field of the Hertz-dipole excitation in 

absence of the dipole structure from the total scattering 

problem as proposed in [4]. The results are shown in 

Figure 4a) to c). 

a) b) c) 

Figure 4: a) Scattering problem. b) Hertz-Dipole excitation. 

c) Scattered field from the dipole structure. 

Next, the scattered field in case of the modulated IC impedance is
calculated with the proposed method. For the modulated IC impedance, it
is assumed that a resistance of Rmod = 150 Ω, which is a typical value
for UHF-RFID transponder ICs, is connected in parallel to the IC
impedance. So the modulated IC impedance is given
to be ZICmod = 62.76 – j 15.6 Ω.
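
The parallel connection can be evaluated directly with complex arithmetic. The sketch below uses the matched IC impedance and the modulation resistance stated in the text; the helper name `parallel` is chosen for this illustration.

```python
def parallel(z_a, z_b):
    """Impedance of two elements connected in parallel."""
    return z_a * z_b / (z_a + z_b)


Z_IC = complex(100.15, -44.05)   # conjugate complex matched IC impedance [Ohm]
R_MOD = 150.0                    # modulation resistance [Ohm]

z_ic_mod = parallel(Z_IC, R_MOD)
print(f"Z_ICmod = {z_ic_mod.real:.2f} {z_ic_mod.imag:+.2f}j Ohm")
```

With these inputs the result is roughly 62.76 − j15.36 Ω: the real part matches the value quoted above and the imaginary part is close to it.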

In Figure 5, the result obtained by the proposed method 

is compared with the result gathered by the method of [4]. 

As it can be seen, a good qualitative agreement between 

the two methods is obtained. The current distribution 

along the dipole structure has also been investigated. This 

is done by comparing the magnetic field intensity in the 

vicinity of the antenna structure along the antenna for 

different phase angles in the excitation. The results are 

shown in Figure 6. In Figure 7, the relative error between 

the two methods is shown. As it can be seen, the good 

agreement between the two methods is also quantitative. 

The difference can be explained by uncertainties in the 

determination of the antenna input impedance ZAnt, since 

the scaling factor in (13) is directly related to this 

number. 

a) b) 

Figure 5: a) Scattered field calculated with the method from [4]. 

b) Scattered field calculated with the proposed method. 

Figure 6: Current distribution along the dipole antenna in terms of the
magnetic field intensity. (Plot "Magnetic field intensity along the
dipole antenna": |H| in A/m versus distance in m; curves: Scattering 0°,
Proposed method 0°, Scattering 180°, Proposed method 180°.)

Figure 7: Relative error of the current distribution along the antenna.
(Plot: relative error of the magnetic field intensity in % versus
distance in m; curve: Relative error 0°.)


V. CONCLUSION 

It has been shown on a very basic example that the scattering from passive objects like UHF-RFID transponder tags can be described in terms of a reference scattering problem and a pure radiation problem. Hence, the proposed method can reduce the computational effort: if multiple IC impedances have to be taken into account, the whole channel including the reader antenna has to be simulated for the reference case only. All other IC states can be modeled as pure radiation problems without having to model the reader structure.

REFERENCES 

[1] K. V. S. Rao, P. V. Nikitin and S. F. Sander, “Antenna design for 

UHF RFID tags: a review and a practical application,” IEEE 

Trans. on Ant. and Prop., vol. 53, no. 12, pp. 3870-3876, 2005. 

[2] V. Chawla and D. S. Ha, “An overview of passive RFID,” IEEE 

Communications Magazine, vol. 45, no. 9, pp. 11-17, Sept. 2007. 

[3] A. Moretto, E. Colin, C. Ripoll and S. A. Chakra, “Shunt behavior 

in RFID UHF tag according to ISO standard and manufacturer 

requirements,” Proceedings of the IEEE, vol. 98, no. 9, pp. 1550- 

1554, 2010. 

[4] T. Bauernfeind, K. Preis, G. Koczka, S. Maier and O. Biro, 

“Influence of the Non-Linear UHF-RFID IC Impedance on the 

Backscatter Abilities of a T-Match Tag Antenna Design,” IEEE 

Trans. on Magn., vol. 48, no. 2, pp. 755-758, 2012. 

[5] R. Wang and J. Jin, “A Flexible Time-Stepping Scheme for 

Hybrid Field-Circuit Simulation Based on the Extended Time- 

Domain Finite Element Method,” IEEE Trans. on Advanced 

Packaging, vol. 33, no. 4, pp. 769-776, 2010. 

[6] G. Manzi and U. Mühlmann, “Passive UHF RFID sensor / 

transponder antenna optimization for backscatter operation by 

electromagnetic-circuital co-simulation,” Proceedings of the 11th 

International Conference on Telecommunications, pp. 17-22, 

2011. 

[7] R. C. Hansen, “Relationships Between Antennas as Scatterers and 

as Radiators,” Proceedings of the IEEE, vol. 77, no. 5, pp. 659- 

662, 1989. 

[8] R.G. Green, “Scattering from conjugate-matched antennas,” IEEE 

Trans. on Ant. and Prop., vol. 14, no. 1, pp. 17-21, 1966. 

[9] O. Biro, “Edge element formulations of eddy current problems,” 

Comput. Methods Appl. Mech. Eng., vol. 169, pp. 391-405, 1999. 

[10] I. Bardi, R. Remski, D. Perry and Z. Cendes “Plane wave 

scattering from frequency-selective surfaces by the finite-element 

method,” IEEE Trans. on Magn., vol. 38, no. 2, pp. 641-644, 

2002. 

[11] C. Balanis, Antenna Theory: Analysis and Design, Hoboken: John 

Wiley & Sons, 2005. 

[12] K. Hollaus, O. Biro, K. Preis and C. Stockreiter, “Edge finite 

elements coupled with a circuit for wave problems,” International 

Conference on Electromagnetics in Advanced Applications, pp. 

956-959, Torino, 2007. 

[13] G. Koczka, T. Bauernfeind, K. Preis and O. Biro, “Schur 

Complement Method Using Domain Decomposition for Solving 

Wave Propagation Problems,” The 10th International Workshop 

on Finite Elements for Microwave Engineering, FEM2010, pp. 53, 

Meredith, 2010. 

[14] G. Koczka, T. Bauernfeind, K. Preis and O. Biro, “An Iterative 

Domain Decomposition Method for Solving Wave Propagation 

Problems,” The 11th International Workshop on Finite Elements 

for Microwave Engineering, FEM2012, pp. 66, Estes Park, 2012. 

[15] R. Albanese and G. Rubinacci, “Solution of three dimensional 

eddy current problems by integral and differential methods,” IEEE 

Trans. on Magn., vol. 24, pp. 98-101, 1988.


Simulation of a High Speed Reluctance Machine 

Including Hysteresis and Eddy Current Losses 

B. Schweighofer∗ , H. Wegleiter∗ , M. Recheis∗ , and P. Fulmek † 

∗Graz University of Technology, Institute of Electrical Measurement and Measurement Signal Processing 

Graz, Austria 

† Vienna University of Technology, Institute of Sensor and Actuator Systems, Vienna, Austria 

E-mail: bernhard.schweighofer@TUGraz.at 

Abstract—Flywheel energy storage systems in automotive applications require a compact design, which typically uses the 

rotor of the electrical machine as storage mass. In order to minimise friction losses the rotor runs in vacuum. The 

heat generated in the rotor can only be transferred by thermal radiation and, possibly, by thermal conduction through the 

bearings. Therefore, a precise estimate of the expected rotor losses is needed to design an efficient thermal management 

of the whole machine. In this paper a switched reluctance machine is analysed by finite–element simulations with focus on 

hysteresis losses and eddy current losses. 

Index Terms—Flywheel, Reluctance Motor, Hysteresis, EM Model 


I. INTRODUCTION 

Flywheel energy storage systems are an important area of research in the automotive industry, to satisfy the demands for low–emission or zero–emission operation by hybridisation and electrification of vehicles, trucks and buses. Typical systems have to be designed for high power and medium energy content, e.g. 120 kW and 1.5 kWh. Advantages of flywheel systems include the high energy density in comparison to double–layer capacitors, the higher power density in comparison to batteries, a State–Of–Charge that is always known, and almost no degradation of performance with age. Additionally, several constraints, such as size, weight, and cost, have to be fulfilled. 

Typically, a compact design is chosen in which the rotor of the electrical machine acts as the flywheel storage mass. Permanent magnet (PM) electrical machines seem to be the optimum choice for flywheel systems due to their high efficiency, high power density, and lower rotor losses in comparison to induction machines. Especially for automotive applications, however, several drawbacks have to be considered: 

– high power rare earth (RE) magnets (samarium–cobalt or neodymium–iron–boron) are very expensive 

– even moderate temperatures may dramatically degrade the performance of the RE–magnets by demagnetisation or even destroy them 

– high mechanical stress due to centrifugal forces on the rotor in flywheel machines necessitates additional supporting structures to protect the RE–magnets 

– the combination of a RE–magnet–rotor with an iron stator leads to significant zero–torque losses, limiting the storage capabilities of the flywheel. 

Reluctance machines (RM) generally show a higher mechanical 

robustness. They can be produced from high– 

strength electrical steels to easily withstand the centrifugal 

forces, the high torques, and the pulse accelerations 

during operation in the vehicle. In comparison to PM– 

machines the rotational speed can be increased, leading to 

higher storage densities. Additionally, no electromagnetic 

losses have to be expected for the free running flywheel, 

which means no zero–torque losses. A comparison between synchronous (synRM) and switched reluctance machines (SRM) shows an advantage for the SRM due to its simple coil design and its higher power density. 

For the design of the SRM the rotor losses need special attention. As the rotor is operated in vacuum to avoid frictional losses, heat dissipation can only happen by thermal radiation and, possibly, by thermal conduction through conventional mechanical bearings. The stator is cooled by conventional water–cooling. For an accurate modeling of the heat flow, as well as for predicting the efficiency of the machine, the losses inside the motor, especially in the rotor, must be known. 

This paper deals with the analysis of the rotor losses in a switched reluctance machine (SRM). We have chosen an external rotor 12/8 SRM with a power of 120 kW and a rated speed of 25000 rpm. 

II. METHODOLOGY 

Losses in ferromagnetic materials are divided into the static, rate–independent hysteresis loss and the dynamic, rate–dependent electromagnetic losses (eddy–current losses, excess losses) [1], [2], [3]. The rate–independent iron losses are determined by the material's hysteresis loop. If the course of the local vectorial flux density and magnetic field is known, the integral over the respective vectorial hysteresis loops gives the hysteresis loss in the material. Existing models for ferromagnetic hysteresis (e.g. Jiles–Atherton, Preisach, Energetic Model) describe only scalar BH–loops. 
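For the scalar case, the loop integral mentioned above can be sketched numerically. The following minimal example assumes that one closed BH–loop is available as ordered samples; the function name and the rectangular test loop (coercivity 100 A/m, saturation 2 T) are hypothetical and are not taken from the measurements in this paper.

```python
def hysteresis_loss_density(h, b):
    """Loop integral of H dB in J/m^3 for one closed scalar BH-loop.

    h: field samples in A/m, b: flux density samples in T, ordered along
    the loop. The last sample is joined back to the first one.
    """
    if len(h) != len(b) or len(h) < 3:
        raise ValueError("need at least three matching (H, B) samples")
    total = 0.0
    for i in range(len(h)):
        j = (i + 1) % len(h)  # wrap around to close the loop
        # trapezoidal rule on the segment from sample i to sample j
        total += 0.5 * (h[i] + h[j]) * (b[j] - b[i])
    return total


# Hypothetical rectangular loop, traversed so that B rises on the +H branch.
h_loop = [100.0, 100.0, -100.0, -100.0]
b_loop = [-2.0, 2.0, 2.0, -2.0]
loss = hysteresis_loss_density(h_loop, b_loop)
print(loss)  # area 2*Hc * 2*Bs = 800 J/m^3 per cycle
```

Traversing the same samples in the opposite sense changes the sign of the result, so the sample order has to follow the physical magnetisation process.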

The calculation of rate–dependent losses is usually based on the time variation of the flux density obtained from static finite element (FE) simulations. With increasing frequencies, however, the skin depth decreases, leading to an increasing deviation of the real flux density from the static simulation results. For the rate–dependent losses several formulations for sinusoidal flux densities have been proposed [1], [4]; they are formulated in the frequency domain and are suitable only for linear materials. Because of the assumption of linear material behaviour, the agreement between simulation results and experiments is unsatisfactory if the material is used at high flux densities, close to saturation. During the operation of a high performance electrical machine the magnetic fields and the flux densities in the core material are neither unidirectional nor sinusoidal. 

Loss calculations taking into account the non–linear material properties, the eddy current distribution and the skin effect require an exorbitantly high computational effort for 3D dynamic FE simulations. 

We have prepared a 2D FE–model of the machine to calculate the distribution of the static flux density for arbitrary rotor angles and excitation patterns. An external source current is used to establish the magnetic field. The model is prepared to analyse the influence of different current pulse patterns, and to develop efficient control strategies. The non–linear ferromagnetic material properties have been modelled by the scalar Energetic Model (EM) [5], which has been parameterised from Epstein frame experiments and from the manufacturer's data sheet. The EM is used to derive the single–valued BH–commutation curve that characterises the material in FEM, and to calculate the static hysteresis loss for arbitrary flux density waveforms [6], [7]. The variation of the flux density with time, obtained from the FE–simulation, is used to estimate eddy current losses by approximate expressions [8]. Finally, an expression for the losses in the machine can be found by integrating these local loss expressions over the whole volume. 

III. RELUCTANCE MACHINE FE–MODEL 

Fig. 1. Geometry of the reluctance machine RM. Three sets of coils produce a rotating magnetic quadrupole field. With the 12/8 ratio the principal step angle is 15° (π/12 rad). The shown position of the external rotor is defined as α = 0°, the chosen rotation direction is clockwise. Two points are chosen to evaluate the magnetic field, flux density and hysteresis loss: Point 1 at the center of a rotor tooth, Point 2 in the surrounding yoke area. 

We have chosen a 12/8 SRM (double 6/4 machine [9]) with an external rotor. The geometrical dimensions of the SRM are described in Table I. With the chosen ratio of air–gap diameter to machine length of approximately 12:16, the third dimension has been omitted, leading to a simpler 2D FEM model of the SRM (Figure 1). Three sets of coils (ABC) build a rotating quadrupole field in the machine. With 12 stator teeth and 8 rotor teeth the step angle is 15°. The central steel shaft of the stator is used for cooling. 

Both stator and rotor are built up from steel sheet. The stator is made from a standard electrical steel. For the rotor material we have chosen the Vacodur50S high–strength cobalt–steel from Vacuumschmelze. 

The rated speed of the SRM is 25000 rpm, which results in a stator coil switching frequency of 10 kHz. 
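The step angle and the switching frequency quoted above can be checked with a short calculation. This sketch assumes the standard SRM relation step angle = 360°/(m·Nr) for m phases and Nr rotor teeth, and it assumes that the 10 kHz figure counts the strokes of all three coil sets together; both function names are illustrative.

```python
def step_angle_deg(phases, rotor_teeth):
    """Step angle in degrees for m phases and Nr rotor teeth."""
    return 360.0 / (phases * rotor_teeth)


def stroke_frequency_hz(phases, rotor_teeth, speed_rpm):
    """Number of phase switch-on events per second, all phases together."""
    strokes_per_rev = phases * rotor_teeth
    return strokes_per_rev * speed_rpm / 60.0


# Values from the text: 3 coil sets (ABC), 8 rotor teeth, 25000 rpm.
angle = step_angle_deg(3, 8)
frequency = stroke_frequency_hz(3, 8, 25000.0)
per_phase = frequency / 3
print(angle, frequency, per_phase)
```

Under these assumptions each individual coil set is switched at one third of the total stroke frequency.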

length of motor            160.0 mm 

external rotor 
  No. of teeth             8 
  material                 Vacodur 50S 
  outer diameter           180.0 mm 
  inner diameter           121.0 mm 
  tooth depth              14.5 mm 
  gap width                28.0 mm 

stator 
  No. of teeth             12 
  material                 Armco electrical steel 
  inner diameter           60.0 mm 
  gap depth                17.4 mm 
  tooth width              14.0 mm 

shaft 
  material                 construction steel 

TABLE I 

GEOMETRIC DIMENSIONS OF SRM. 

Fig. 2. FEM: Gmsh/GetDP 2D–model for rotor position α = 15°. 17516 vertices and 37224 elements in a free triangular mesh. 

For the FEM simulation we use two different software packages: the commercial Comsol–Multiphysics [10] and the general open–source packages Gmsh/GetDP [11]. Figure 2 shows the final mesh of the model with 37224 triangular elements. In both FEM–packages we defined the magnetic material properties as a BH–table, calculated by the Energetic Model. 

IV. MATERIALS MODEL 

The static magnetic material properties are described by the Energetic Model (EM) of ferromagnetic hysteresis [5]. The EM describes the non–linear, hysteretic behaviour of magnetic polarisation and magnetic field in the material based on the concept of minimising the total energy in a statistical description of the magnetic domain structure. Consequently, many physical factors influencing the magnetisation process can be included, e.g. magnetocrystalline anisotropy, internal demagnetisation, and anisotropy of magnetostriction. The EM can simulate major and minor hysteresis loops. In contrast to other models (e.g. the Preisach model with its wipe–out property), it also simulates the effect of slowly evolving closed minor loops. The simplified scalar formulation of the EM [5], [6] is well suited for integration into FEM. The parameters of the EM are found by evaluating several important points of a measured BH–loop (e.g. saturation, coercivity, remanence, initial susceptibility). 

The rotor material has to be chosen with respect to both its soft–magnetic properties (low coercivity, high flux density, low losses) and its mechanical properties. A material with optimum properties for high speed rotors is the soft–magnetic cobalt–iron alloy Vacodur50 manufactured by Vacuumschmelze [12]. Due to the Co–content it exhibits a very high saturation flux density of 2.35 T. At a magnetic field strength of 800 A/m more than 2.0 T is reached. The coercivity is in the range of 100–200 A/m. Figure 3 shows the BH–loop from the Vacodur50S datasheet and the corresponding results of EM–simulations. 

Fig. 3. Hysteresis loops of Vacodur50S. Red lines: BH–loop from 

datasheet (Vacuumschmelze), blue line: BH–loop (virgin curve and 

major hysteresis loop) from EM–simulation. 

Parameter   Vacodur50S   Armco 
Js          2.40         2.10 
q           40.12        15.26 
k           314.95       16.31 
Ne          2.09e-5      3.15e-6 
g           11.30        23.41 
h           3.40         3.1e-3 

TABLE II 

EM–PARAMETERS FOR VACODUR50S AND ARMCO 


For our simulations of the stator material we used the EM parameterised for soft–magnetic Armco electrical steel sheets (GO, FeSi) [13]. This grain–oriented FeSi–steel exhibits a coercivity as low as 7 A/m, the technical saturation is limited to 2.0 T. Epstein measurements provided the data necessary to parameterise the EM. 

Table II shows the parameters used for the EM simulations of rotor and stator material. The single–valued BH–function required for our FEM simulations has been found by EM calculations of the commutation curves (Figs. 4 and 5). Both FEM packages (Comsol, GetDP) use tables of the simulated BH–commutation curve to interpolate the required BH–point. 
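The table lookup described above can be illustrated with a piecewise–linear interpolation. This is only a sketch: the interpolation scheme actually used inside Comsol and GetDP is not stated here, and the table values below are hypothetical placeholders, not the simulated commutation curve of either material.

```python
from bisect import bisect_right
import math

MU0 = 4.0e-7 * math.pi  # vacuum permeability in Vs/(Am)


def interpolate_b(h_table, b_table, h):
    """Piecewise-linear B(H) lookup in a monotonically increasing table."""
    if not h_table[0] <= h <= h_table[-1]:
        raise ValueError("H outside the tabulated range")
    k = min(bisect_right(h_table, h), len(h_table) - 1)
    h0, h1 = h_table[k - 1], h_table[k]
    b0, b1 = b_table[k - 1], b_table[k]
    return b0 + (h - h0) / (h1 - h0) * (b1 - b0)


def relative_permeability(h, b):
    """Quotient B / (mu0 * H), defined for H > 0."""
    return b / (MU0 * h)


# Hypothetical commutation-curve samples (H in A/m, B in T).
H_TABLE = [0.0, 100.0, 200.0, 800.0, 5000.0]
B_TABLE = [0.0, 0.5, 1.2, 2.0, 2.3]

b_mid = interpolate_b(H_TABLE, B_TABLE, 150.0)
mu_r = relative_permeability(150.0, b_mid)
print(b_mid, mu_r)
```

The same quotient B/(μ0H) is the quantity plotted as relative permeability in the FEM results.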

Fig. 4. Hysteresis loops for Armco electrical steel. EM–simulations 

for several complete symmetrical loops build the commutation curve. 

Fig. 5. Vacodur50S: simulated symmetrical hysteresis loops build the 

commutation curve. 

V. FEM RESULTS 

The model setup described above (geometry and materials) was used for magnetostatic FEM simulations with current excitation in Comsol and GetDP. Both packages gave almost exactly identical results. A series of static calculations has been done to simulate the rotating machine for various coil currents. Typical results of both FEM simulations are depicted in the following figures. The excitation current was applied to coil set A (see Fig. 1), producing a quadrupole magnetic field. The angular rotor position has been changed in 1° steps, and the corresponding flux densities at two points of the rotor and the flux in the coil A stator tooth have been evaluated as examples to estimate the rotor losses. 

Figure 6 shows the corresponding relative permeability values, i.e. the quotient B/(μ0H), for a moderate coil current density of 1 A/mm². Different μr–scales are used for stator and rotor. As the respective range of flux densities (Fig. 7) is below 300 mT, the permeability is still rising with flux density (cf. Figs. 4 and 5). Accordingly, the maximum permeability is found at the loci of maximum flux density. 

Figure 8 shows Comsol results for a coil current density of 5 A/mm². The flux density at the partly overlapping stator and rotor teeth approaches the saturation of the materials. 

Fig. 6. GetDP simulation: relative magnetic permeabilities μr at coil A current density s = 1 A/mm² (I = 112 A), rotor angle α = 10°. Different scales are used for rotor (Vacodur50) and stator (Armco). 

Fig. 7. GetDP simulation: flux density B at coil A current density s = 1 A/mm² (I = 112 A), rotor angle α = 10°. 

The evaluation of the flux in the stator teeth is shown in Fig. 9. The line integral of B over a plane stator tooth cross–section, multiplied by the length of the machine, gives the total flux in Wb for a quarter of the machine. Perfect alignment of stator and rotor teeth at a rotor angle of 22.5° gives the steepest flux versus field curve. As the rotor teeth are moved towards the gap between the stator teeth, the demagnetising effect of the increasing length of the effective air gap is clearly visible. When the stator tooth faces a rotor gap exactly at 0°, the flux versus field characteristic becomes a rather flat, almost straight line. 
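The flux evaluation described above can be sketched as a one–dimensional quadrature. The example assumes that the normal component of B has been sampled along a straight line across the tooth; the uniform 1 T profile is a hypothetical test input, while the tooth width (14.0 mm) and machine length (160.0 mm) are taken from Table I.

```python
def tooth_flux(positions_m, b_normal_t, machine_length_m):
    """Flux in Wb: machine length times the line integral of the normal B."""
    if len(positions_m) != len(b_normal_t) or len(positions_m) < 2:
        raise ValueError("need at least two matching samples")
    line_integral = 0.0
    for k in range(len(positions_m) - 1):
        dx = positions_m[k + 1] - positions_m[k]
        # trapezoidal rule between neighbouring sample points
        line_integral += 0.5 * (b_normal_t[k] + b_normal_t[k + 1]) * dx
    return machine_length_m * line_integral


# Hypothetical uniform profile across the 14.0 mm wide stator tooth.
positions = [i * 0.014 / 7 for i in range(8)]
b_normal = [1.0] * 8
flux = tooth_flux(positions, b_normal, 0.160)
print(flux)  # 1 T * 0.014 m * 0.160 m
```

With field values exported from the FE solution in place of the uniform profile, the same routine would give one point of a flux versus current curve.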


Fig. 8. Comsol simulation: flux density B at coil A current density s = 5 A/mm² (I = 560 A), rotor angle α = 10°. 

Fig. 9. Flux in stator tooth versus coil current for varying rotor angle. Maximum flux for α = 22.5°, stator tooth exactly aligned with rotor tooth, minimum demagnetisation. Minimum flux for α = 0°, stator tooth exactly between rotor teeth, maximum demagnetisation, almost linear dependence. 

VI. LOSSES 

The conventional three–term iron loss model [1], [8] contains expressions for hysteresis loss, eddy current loss, and excess loss. All these loss contributions depend on the flux density itself and its time derivative. The local flux densities are usually estimated by finite element analyses that completely neglect hysteresis and eddy currents, and sometimes even the single–valued non–linear BH curve. In a first step, FEM calculations approximately determine the local flux density. In a second step, the iron losses are determined from this flux density. 

A. Hysteresis loss 

In our work we use the EM hysteresis model described above to calculate the hysteresis loss for any arbitrary course of flux densities in the material. Series of simulations for constant coil current and varying rotor angle are used to identify the course of the flux density at two chosen distinct points of the rotor (see Fig. 1). Figure 10 shows the evolution of the vectorial components of the flux density at point 1 (rotor tooth) versus rotor angle when coil A is switched on. The symmetry with a change in sign at 90° and the periodicity of 180° are evident. Both components of the B–vector, radial and tangential, and the magnitude of the flux density are shown. 

Under normal operation each set of coils (ABC) is active during a rotation angle of 15°. Figure 11 shows the radial component of the flux density at point 1 (rotor tooth) when all coils A–B–C are excited sequentially, leading to a periodicity of 60° for a fixed point on the rotor. The total hysteresis loss at point 1 for a complete revolution of 360° is Wh = 6 · 555.5 J/m³ ≈ 3332.8 J/m³ (Fig. 12). 

Fig. 10. Flux density at rotor point 1 (tooth) versus rotor angle, coil A activated with 5 A/mm². The thick lines indicate the normal switched–on range for coils A. 

Fig. 11. Radial component of the flux density at rotor point 1 (tooth) versus rotor angle. Coils A–B–C are excited sequentially with 5 A/mm². 

The same procedure, described above for a rotor tooth, is applied to point 2 on the yoke part of the rotor (see Fig. 1). Figure 13 shows the evolution of the vectorial components of the flux density at point 2 (rotor yoke) versus rotor angle when coil A is switched on. In the rotor yoke area there exists only a tangential component of the flux density, the radial component completely vanishes. Figure 14 shows the tangential component of the flux density at point 2 (rotor yoke) when all coils A–B–C are excited sequentially, leading to a periodicity of 60° for a fixed point on the rotor. The total hysteresis loss at point 2 for a complete revolution of 360° is Wh = 6 · 672.1 J/m³ = 4032.6 J/m³ (Fig. 15). The two extra minor loops lead to a significant increase of the hysteresis loss in the rotor’s yoke area. 
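The scaling from one magnetisation period to a full revolution, and further to a loss power density, can be written out as follows. The per–revolution energies are the values quoted above; evaluating them at the rated speed of 25000 rpm is an illustrative assumption, justified only by the rate–independent nature of the static hysteresis loss.

```python
def periods_per_revolution(period_deg):
    """Number of magnetisation periods in one mechanical revolution."""
    return 360.0 / period_deg


def power_density(loss_per_rev_j_m3, speed_rpm):
    """Loss power density in W/m^3 from the energy density per revolution."""
    return loss_per_rev_j_m3 * speed_rpm / 60.0


n_periods = periods_per_revolution(60.0)       # periodicity of 60 degrees
p_tooth = power_density(3332.8, 25000.0)       # point 1, rotor tooth
p_yoke = power_density(4032.6, 25000.0)        # point 2, rotor yoke
print(n_periods, p_tooth, p_yoke)
```

These are local values at two sample points and do not replace the volume integration needed for the total rotor loss.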


Fig. 12. Simulated BH–loop for a complete period of the magnetisation process at point 1, covering 60° rotational angle. 

Fig. 13. Flux density at rotor point 2 (yoke) versus rotor angle, coil A activated with 5 A/mm². The thick lines indicate the normal switched–on range for coils A. 

Figure 16 shows the hysteresis losses at two points of the rotor for a complete rotor revolution as a function of coil current. Although the amplitude of the flux density is significantly larger in the rotor tooth than in the rotor yoke, the losses in the yoke are dominant due to the existence of a pair of extra minor loops. As the yoke reaches local saturation, however, the flux density becomes distributed more uniformly, and the amplitude of the minor loops decreases, accompanied by a decreasing hysteresis loss. 

Fig. 14. Tangential component of the flux density at rotor point 2 (yoke) versus rotor angle. Coils A–B–C are excited sequentially with 5 A/mm². 

Fig. 15. Simulated BH–loop for a complete period of the magnetisation process at point 2, covering 60° rotational angle. 

Fig. 16. Hysteresis loss for one complete rotor revolution versus coil current. The losses are calculated for two points in the rotor (see Fig. 1). 

B. Eddy current loss 

Eddy current losses depend on the rate of change of flux density, the electrical conductivity of the material, and the geometry. As long as the induced eddy currents are small enough to allow the flux density to penetrate the material completely, eddy current losses can be calculated locally in a straightforward manner. This criterion limits sinusoidal excitation to frequencies below 1 kHz (Vacodur50: h = 0.35 mm, μr ≈ 4000, σ = 2.83 · 10^6 S/m). A modified expression [8] can be used to determine eddy current loss and excess loss: 

$$W'_e \cong \frac{\kappa \sigma h^2}{12} \cdot \frac{1}{T} \int_0^T \!\! \int \left( \frac{\mathrm{d}B}{\mathrm{d}t} \right)^2 \mathrm{d}v \, \mathrm{d}t$$ 

The time derivative dB/dt is defined by our FEM 

results, σ is the electrical conductivity of the rotor material, 

h is the thickness of the rotor steel sheets, κ is the 

modified coefficient for excess loss, T is the time period. 
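A local version of this expression, without the volume integral, can be evaluated for any sampled periodic waveform. The sketch below reads the result as a time–averaged loss power density; the sinusoidal test waveform, its 400 Hz frequency and the periodic central difference are assumptions made for illustration. For a sinusoid the result can be compared with the classical value σh²π²f²B̂²/6.

```python
import math


def eddy_loss_density(b, period, sigma, h, kappa=1.0):
    """kappa*sigma*h^2/12 * (1/T) * integral of (dB/dt)^2 dt, in W/m^3.

    b: samples of B in T, equally spaced over one period (endpoint excluded).
    """
    n = len(b)
    dt = period / n
    acc = 0.0
    for i in range(n):
        # periodic central difference; b[i - 1] wraps around for i = 0
        dbdt = (b[(i + 1) % n] - b[i - 1]) / (2.0 * dt)
        acc += dbdt ** 2 * dt
    return kappa * sigma * h ** 2 / 12.0 * acc / period


SIGMA = 2.83e6   # S/m, Vacodur50 value from the text
H_SHEET = 0.35e-3  # m, sheet thickness from the text
F_TEST = 400.0   # Hz, hypothetical test frequency
B_HAT = 1.0      # T, hypothetical amplitude

n_samples = 1000
b_wave = [B_HAT * math.sin(2.0 * math.pi * i / n_samples) for i in range(n_samples)]
p_numeric = eddy_loss_density(b_wave, 1.0 / F_TEST, SIGMA, H_SHEET)
p_classic = SIGMA * H_SHEET ** 2 * math.pi ** 2 * F_TEST ** 2 * B_HAT ** 2 / 6.0
print(p_numeric, p_classic)
```

For the non–sinusoidal flux densities obtained from the FE series, only the sample list changes.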

Figure 17 shows the resulting eddy current losses for 

a complete revolution at 12000 rpm. The modified loss 

coefficient was chosen as κ =1. For higher frequencies 

the skin–effect has to be taken into account additionally. 
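The frequency limit mentioned in this section can be estimated from the classical skin depth δ = 1/√(πfμσ). The sketch below uses the Vacodur50 values given above; taking "complete penetration" to mean δ ≥ h/2 is an assumption of this example, not a criterion stated in the text, and a constant μr ignores the non–linear BH curve.

```python
import math

MU0 = 4.0e-7 * math.pi  # vacuum permeability in Vs/(Am)


def skin_depth(frequency_hz, mu_r, sigma):
    """Classical skin depth in m for a linear, homogeneous conductor."""
    return 1.0 / math.sqrt(math.pi * frequency_hz * mu_r * MU0 * sigma)


def penetration_limit_hz(thickness_m, mu_r, sigma):
    """Frequency at which the skin depth equals half the sheet thickness."""
    return 1.0 / (math.pi * mu_r * MU0 * sigma * (thickness_m / 2.0) ** 2)


delta_1khz = skin_depth(1000.0, 4000.0, 2.83e6)
f_limit = penetration_limit_hz(0.35e-3, 4000.0, 2.83e6)
print(delta_1khz * 1e3, "mm at 1 kHz")
print(f_limit, "Hz for skin depth = h/2")
```

Under these assumptions the limit lies somewhat below 1 kHz, which agrees in order of magnitude with the statement above.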

VII. CONCLUSIONS 

A 2D finite element model of a switched reluctance machine is presented, using the EM–hysteresis model to derive the non–linear, hysteretic BH–function. The EM is parameterised from BH–loops from experiment (e.g. Epstein measurement) or data–sheet information. The FEM calculations use only the single–valued commutation curve, derived from EM simulations. Hysteresis loss is calculated based on the local flux densities resulting from 2D–FEM, by numerical integration of the respective EM BH–loops. Under the necessary assumption of a negligible influence of the skin effect, we can estimate eddy current and excess losses in the rotor as well. 

Fig. 17. Eddy current loss per revolution at 12000 rpm versus coil current. The losses are calculated for two points in the rotor (see Fig. 1). 

VIII. ACKNOWLEDGMENT 

Part of this research has been supported by the Austrian 

FFG, project No. 824164. 

REFERENCES 

[1] G. Bertotti. Hysteresis in magnetism. Academic Press, 2008. 

[2] I. D. Mayergoyz. Mathematical Models of Hysteresis. New York, 

Springer, 1991. 

[3] D. C. Jiles, D. L. Atherton. J. Magn. Magn. Mater. vol. 61, pp. 

48–60, 1986. 

[4] D. Lin et.al. IEEE Trans. Magn. 40 (2), pp. 1318–1321, 2004. 

[5] H. Hauser. J. Appl. Phys. 96 (5), pp. 2753–2767, 2004. 

[6] P. Fulmek, P. Haumer, H. Wegleiter, B. Schweighofer. COMPEL 

29 (6), pp. 1504–1513, 2010. 

[7] P. Fulmek, N. Mehboob, P. Haumer, M. Kriegisch, R. Grössinger. 

SMM19, Book of Abstracts, B1–16, 2009. 

[8] K. Yamazaki, N. Fukushima. IEEE Trans. Magn. 46 (8), 3121 – 

3124, 2010. 

[9] T. J. E. Miller. Switched Reluctance Motors and their Control. 

Magna Physics Publishing and Oxford Science Publications 

(1993) 

[10] Comsol Multiphysics, http://www.comsol.com. 

[11] C. Geuzaine, J.–F. Remacle, Gmsh, http://www.geuz.org/gmsh. 

P. Dular, C. Geuzaine, GetDP, http://www.geuz.org/getdp. 

[12] Vacodur50, Soft magnetic Cobalt Iron, http://www.vacuumschmelze.com/ 

[13] http://www.aksteel.eu/en/1-products/3-electrical-sheet/


An Iterative Domain Decomposition Method for 

Solving Wave Propagation Problems 

*Gergely Koczka, *Thomas Bauernfeind, *Kurt Preis and *Oszkár Bíró 

*Institute for Fundamentals and Theory in Electrical Engineering, Inffeldgasse 18, A-8010 Graz, Austria 

E-mail: gergely.koczka@TUGraz.at 

Abstract—Solving wave propagation problems with FEM results in a huge number of unknowns due to the large air volume to 

be modeled. These equation systems are very ill-conditioned because of the large differences in material properties, the changes in element size, and 

the fact that the system matrices are indefinite. Common iterative methods (CG, GMRES) converge poorly under 

these conditions, whereas the weakness of direct methods is their memory requirement. The aim of this paper is to present a 

method which has a smaller memory requirement than direct methods and converges faster than common iterative methods. 



I. INTRODUCTION 

How to solve huge, indefinite, ill-conditioned equation systems efficiently is still an open question. Common iterative methods (CG, GMRES) converge poorly under these conditions [1]. Assembling the system matrix of wave propagation problems in the frequency domain with the finite element method (FEM) results in this kind of equation system because of the large differences in material properties and the changes in element size. 

Applying direct solvers to overcome the problem of the ill-conditioned system of equations results in high memory requirements [2]. The aim of this paper is to present a method with a smaller memory requirement than direct methods and better convergence than common iterative methods. 

The memory requirement of the classical LU decomposition applied to sparse matrices can be reduced with fill-in reduction algorithms: the minimum degree algorithm [3] or the nested dissection algorithm [4]. These algorithms are implemented in the Intel® Math Kernel Library (PARDISO: sparse linear equation system solver routine). However, the memory requirement of the direct method grows faster than linearly with the number of unknowns $n$: between $O(n \log n)$ and $O(n^2)$, typically $O(n^{1.3})$ to $O(n^{1.4})$. For huge systems a method is required which is capable of decreasing it. 

Fig. 1. The geometrical decomposition of the domain 

II. DOMAIN DECOMPOSITION 

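With geometrical domain decomposition it is possible to decrease the memory requirement of the direct method. 

The problem domain $\Omega$ should be subdivided into $n$ open, disjoint sub-domains, namely 

$$\overline{\Omega} = \bigcup_{i=1}^{n} \overline{\Omega}_i,$$ (1) 

$$\forall i, j \in \{1, 2, \ldots, n\}: \; i \neq j \Rightarrow \Omega_i \cap \Omega_j = \emptyset.$$ (2) 

The interfaces between the domains are 

$$\Gamma_{ij} := \overline{\Omega}_i \cap \overline{\Omega}_j \quad (i \neq j),$$ (3) 

$$\Gamma := \bigcup_{i,j=1}^{n} \Gamma_{ij}.$$ (4) 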

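With these notations, the equation system can be written in block form: 

$$\begin{bmatrix} A_{II} & A_{I\Gamma} \\ A_{\Gamma I} & A_{\Gamma\Gamma} \end{bmatrix} \begin{bmatrix} x_I \\ x_\Gamma \end{bmatrix} = \begin{bmatrix} b_I \\ b_\Gamma \end{bmatrix},$$ (5) 

where the sub-matrix $A_{II}$ corresponds to the domains, $A_{\Gamma\Gamma}$ to the interfaces between the domains, and $A_{I\Gamma}$ and $A_{\Gamma I}$ to the connections between the sub-domains and the interface. 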

The Schur-complement equation system of (5) is obtained as 

$$\underbrace{\left( A_{\Gamma\Gamma} - A_{\Gamma I} A_{II}^{-1} A_{I\Gamma} \right)}_{S} x_\Gamma = \underbrace{b_\Gamma - A_{\Gamma I} A_{II}^{-1} b_I}_{c}.$$ (6) 

To build the Schur-complement matrix $S$, it is necessary to have regular subsystems corresponding to the domains (the matrix $A_{II}$ has to be regular). 

The memory requirement of the method can be estimated as follows. Let us assume that the memory requirement of the direct method is $O(n_e^{\alpha})$, where $n_e$ is the number of equations and $1 < \alpha < 2$. If all the sub-domains have the same number of unknowns, then 

$$n \left( \frac{n_e}{n} \right)^{\alpha} = n_e^{\alpha} \, n^{1-\alpha},$$ (7) 

i.e. the overall memory requirement is decreased by a multiplier which depends on the number of sub-domains. Assembling the Schur complement matrix takes a long time. However, the full Schur complement matrix is not necessary for solving the reduced equation system. Applying the biconjugate gradient method (BiCG) to solve the Schur complement equation system results in an efficient iterative solver (DD-BiCG) for solving wave propagation problems. 
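The idea of applying the Schur complement without assembling it can be sketched on a small dense example. This is a minimal illustration, not the implementation used in this paper: the matrices are random, real, symmetric and diagonally dominant (an assumption that keeps the toy problem well behaved, unlike the indefinite systems treated here), the hand-written `bicg` routine is unpreconditioned, and all names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)


def sym_block(n):
    """Random symmetric, diagonally dominant test block."""
    m = rng.uniform(-1.0, 1.0, (n, n))
    return 0.5 * (m + m.T) + 20.0 * np.eye(n)


# Two sub-domains with 4 interior unknowns each, 3 interface unknowns.
blocks = [sym_block(4), sym_block(4)]
A_II = np.zeros((8, 8))
A_II[:4, :4], A_II[4:, 4:] = blocks
A_IG = rng.uniform(-1.0, 1.0, (8, 3))
A_GI = A_IG.T
A_GG = sym_block(3)
b_I = rng.uniform(-1.0, 1.0, 8)
b_G = rng.uniform(-1.0, 1.0, 3)


def solve_interior(rhs):
    """Apply A_II^{-1} block by block (independent sub-domain solves)."""
    out = np.empty_like(rhs)
    out[:4] = np.linalg.solve(blocks[0], rhs[:4])
    out[4:] = np.linalg.solve(blocks[1], rhs[4:])
    return out


def schur_matvec(v):
    """S v = A_GG v - A_GI A_II^{-1} A_IG v, without assembling S."""
    return A_GG @ v - A_GI @ solve_interior(A_IG @ v)


def bicg(matvec, rmatvec, b, tol=1e-10, maxiter=200):
    """Plain biconjugate gradient iteration for real systems."""
    x = np.zeros_like(b)
    r = b - matvec(x)
    rt = r.copy()
    p, pt = r.copy(), rt.copy()
    rho = rt @ r
    for _ in range(maxiter):
        if np.linalg.norm(r) <= tol * np.linalg.norm(b):
            break
        q, qt = matvec(p), rmatvec(pt)
        alpha = rho / (pt @ q)
        x += alpha * p
        r -= alpha * q
        rt -= alpha * qt
        rho_new = rt @ r
        p = r + (rho_new / rho) * p
        pt = rt + (rho_new / rho) * pt
        rho = rho_new
    return x


c = b_G - A_GI @ solve_interior(b_I)         # right-hand side of (6)
x_G = bicg(schur_matvec, schur_matvec, c)    # S is symmetric here
x_I = solve_interior(b_I - A_IG @ x_G)       # back-substitution
```

Only the factorisations of the sub-domain blocks have to be stored, which is where the memory saving estimated in (7) comes from. In a sparse implementation `np.linalg.solve` would be replaced by stored sparse factorisations of each block.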

III. ANALYSIS OF THE METHOD 

A. Stability 

Since the original equation system is symmetric, the matrices corresponding to the sub-domains and the interfaces are also symmetric. To improve the conditioning of (6), the following form is used: 

$$\underbrace{\left( I - L^{-1} A_{\Gamma I} A_{II}^{-1} A_{I\Gamma} L^{-T} \right)}_{L^{-1} S L^{-T}} \underbrace{L^{T} x_\Gamma}_{y} = \underbrace{L^{-1} b_\Gamma - L^{-1} A_{\Gamma I} A_{II}^{-1} b_I}_{L^{-1} c},$$ (8) 

where $A_{\Gamma\Gamma} = L L^{T}$ is the Cholesky decomposition of $A_{\Gamma\Gamma}$. Using (8) with BiCG without preconditioning is theoretically equivalent to the form (6) using $A_{\Gamma\Gamma}$ as a preconditioner. The practical examples show that the symmetric form (8) is more stable and therefore converges faster. 

B. Condition number 

The convergence speed of BiCG depends on the condition number of the equation system. 

Since the A,v formulation is used to solve Maxwell’s equations, the resulting linear equation system is singular. To formulate a regular system, a tree has to be eliminated in the discretized domain. Because the singular system is better conditioned than the regular one, it is better to work with the singular system. 

If the sub-domain matrix $A_{II}$ is regular, the Schur-complement matrix is singular if and only if the original matrix is singular: 

$$\exists A_{II}^{-1}: \quad \left( A_{\Gamma\Gamma} - A_{\Gamma I} A_{II}^{-1} A_{I\Gamma} \right) v_\Gamma = 0 \;\Leftrightarrow\; \begin{bmatrix} A_{II} & A_{I\Gamma} \\ A_{\Gamma I} & A_{\Gamma\Gamma} \end{bmatrix} \begin{bmatrix} v_I \\ v_\Gamma \end{bmatrix} = 0, \quad v_I = -A_{II}^{-1} A_{I\Gamma} v_\Gamma.$$ (9) 

Hence, to keep the Schur complement matrix singular, it is sufficient not to eliminate edges on the interface. 

A huge advantage of this method is that the matrix $A_{II}$, corresponding to the sub-domains, is block-diagonal, and its blocks can be inverted in parallel. The speed of one iteration step of the DD-BiCG can be increased by choosing the sub-domains with about the same number of unknowns and using a multi-core computer. 

IV. NUMERICAL RESULT 

The efficiency of this method is shown on the example of a dipole. The dipole has a length of 140 mm, a width of 1 mm and a thickness of 20 μm. There is a 160 μm air gap in the middle (see Fig. 3). An air volume of 250 mm radius is modeled around the antenna (see Fig. 2). The air volume is truncated by a first order absorbing boundary condition (ABC). 

Fig. 2. The structure of the dipole antenna and the truncation of the air 

volume. 1/8 model. 

The voltage is prescribed in the air gap (1 V, 1.5 GHz). Modeling an eighth of the problem with the A,v formulation (A is the magnetic vector potential, v is the modified electric scalar potential) and second order hexahedral finite elements (20 nodes, 36 edges), the resulting problem has 1,986,152 edges and 669,398 nodes. 

Fig. 3. The structure of the dipole antenna (yellow) near the air gap and 

the prescribed electric field in the gap (blue). 1/8 model. 

The efficiency of the DD-BiCG method compared with the incomplete Cholesky preconditioned biconjugate gradient method (IC-BiCG) is shown in Fig. 6. The convergence criterion was that the global relative residual become smaller than $10^{-7}$.

To demonstrate the efficiency of the method, two different decompositions were tested. In the first case the domain was subdivided into 5 sub-domains (see Fig. 4), in the second case into 8 sub-domains (see Fig. 5).

Fig. 4. The problem with five sub-domains. 

In the first case the first domain is the antenna, the second 

is a small air volume around the antenna, and the huge air 

volume is subdivided into three parts. 

Fig. 5. The problem with eight sub-domains. 

In the second case, the antenna and a small air volume 

again build the first two sub-domains, but the air volume 

is subdivided into six parts. 

TABLE I
Comparison of the methods

Method             Memory requirement   Iterations   Run time (h)
ICCG               9.0 GB               68,750       77.86
DDCG, 5 domains    41.8 GB              975          4.41
DDCG, 8 domains    31.8 GB              1,036        4.66
LU                 81.0 GB              -            -

To solve the problem an “Intel(R) Xeon(R) CPU 2x 

X5570@2.93GHz 8 cores 64 GB RAM” computer was 

used. 


Fig. 6. The best residuum of the methods during the iterations. 

Blue line: DD-BiCG (5 domains) global residuum; 

Red line: DD-BiCG residuum on the interface; 

Blue dotted line: DD-BiCG (8 domains); 

Red dotted line: DD-BiCG (8 domain) residuum on the interface; 

Black line: IC-BiCG residuum in the whole domain. 

V. CONCLUSION 

Applying the domain decomposition method to the iterative solution of huge indefinite equation systems yields an efficient method: it requires less memory than direct methods and accelerates the solution by decreasing the number of iterations. It also allows the algorithm to be parallelized efficiently.

The choice of the sub-domains is very important to 

increase the efficiency of the method. The sub-domains 

should have about the same number of unknowns. 

Increasing the number of domains decreases the memory 

requirement but results in a higher condition number. 

REFERENCES 

[1] O. Nevanlinna, Convergence of iterations for linear equations. 

Birkhauser Verlag AG, Basel, 1993, pp. viii+177 

[2] G. Koczka , T. Bauernfeind, K. Preis and O. Bíró, "Schur 

complement method using domain decomposition for solving 

wave propagation problems," presented at The 10th International 

Workshop on Finite Elements for Microwave Engineering, 

Meredith, New Hampshire United States, Oct. 12-13, 2010. 

[3] J.W.H. Liu. Modification of the Minimum-Degree algorithm by 

multiple elimination. ACM Transactions on Mathematical 

Software, 11(2):141-153, 1985. 

[4] G. Karypis and V. Kumar. A Fast and High Quality Multilevel 

Scheme for Partitioning Irregular Graphs. SIAM Journal on 

Scientific Computing, 20(1):359-392, 1998.


On Effectiveness of Model Order Reduction 

for Computational Electromagnetism 

*Yuki Sato, *Hajime Igarashi 

*Graduate School of Information Science and Technology, Hokkaido University 

Kita 14, Nishi 9, Kita-ku, Sapporo, 060-0814 

E-mail: yukisato@em-si.eng.hokudai.ac.jp 

Abstract— This paper presents the model reduction method based on the method of snapshots for time-domain finite element 

analysis of quasi-static electromagnetic fields. In this method, snapshots of the transient electromagnetic fields over a relatively short period are stored to build the variance-covariance matrix, from whose eigenvectors the basis functions for the reduced analysis are constructed. The effect of the method's parameters, such as the number of snapshots and the snapshot interval, on the results of the reduced field computations is discussed.

Index Terms—Model order reduction, finite element method, method of snapshots, eddy current problem. 

I. INTRODUCTION

In recent years, the finite element method (FEM) has been widely applied to the transient analysis of quasi-static and high-frequency electromagnetic fields. However, since the FE equations must be solved at each time step, the computational burden is significant. Therefore, much effort has been made to reduce the computational time of transient electromagnetic field analysis.

One of the most promising methods to shorten the computational time is the time-periodic explicit error correction (TP-EEC) method [1], [2]. It has been shown that the TP-EEC method applied to non-linear eddy current problems reduces the computational time of the transient analysis and gives correct steady-state solutions [1], [2]. However, the TP-EEC method cannot be applied to non-time-periodic problems or high-frequency problems.

On the other hand, there is another method, called model order reduction, which can reduce the computational time of transient analysis [3], [4]. In this method, snapshots of the transient solution are stored for a short initial period. Then, using these snapshots, a variance-covariance matrix is constructed and its eigenvectors are computed. The reduced FE matrix is then constructed using the transform matrix whose column space is spanned by the dominant eigenvectors. This method has several merits: it can accurately analyze transient solutions, and it can be applied to non-time-periodic problems and high-frequency problems. However, an accurate analysis requires adequate values of the parameters of the method, such as the snapshot interval and period, and the number of basis vectors chosen from the eigenvectors of the variance-covariance matrix. The dependence of the accuracy on these parameters has not been clarified. Moreover, although the effectiveness of model reduction for two-dimensional eddy current problems has been discussed [5], it has not yet been shown for three-dimensional problems.

In our study, the dependence of the accuracy of the model reduction method on its parameters is evaluated by applying it to three-dimensional eddy current problems. Moreover, we discuss a possible method to determine adequate values of these parameters.

II. REDUCTION TECHNIQUE 

A. Time-Domain Finite Element Method 

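The A-φ (A-V) method is used for the FE analysis of quasi-static electromagnetic fields. The governing equations derived from Maxwell's equations are expressed as

$$\mathrm{rot}\,(\nu\,\mathrm{rot}\,\mathbf{A}) + \sigma\left(\frac{\partial \mathbf{A}}{\partial t} + \mathrm{grad}\,\frac{\partial \phi}{\partial t}\right) = \mathbf{J}, \tag{1}$$

$$\mathrm{div}\left[\sigma\left(\frac{\partial \mathbf{A}}{\partial t} + \mathrm{grad}\,\frac{\partial \phi}{\partial t}\right)\right] = 0, \tag{2}$$

where ν is the magnetic reluctivity, σ is the conductivity and J is the forced current density. The vector potential A and the scalar potential φ are discretized as follows:

$$\mathbf{A} = \sum_{j=1}^{e} a_j \mathbf{N}_j, \tag{3}$$

$$\phi = \sum_{j=1}^{n} \phi_j N_j, \tag{4}$$

where e and n are the numbers of edges and nodes, and $\mathbf{N}_j$ and $N_j$ are the vector and scalar interpolation functions, respectively. Applying the weighted residual method with Galerkin weighting to (1) and (2) results in the FE equation given by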

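$$\begin{bmatrix} K & 0 \\ 0 & 0 \end{bmatrix}\begin{bmatrix} a \\ \phi \end{bmatrix} + \frac{d}{dt}\begin{bmatrix} N & S \\ S^t & M \end{bmatrix}\begin{bmatrix} a \\ \phi \end{bmatrix} = \begin{bmatrix} b \\ 0 \end{bmatrix}, \tag{5}$$

where

$$K_{ij} = \int_V \nu\,\mathrm{rot}\,\mathbf{N}_i \cdot \mathrm{rot}\,\mathbf{N}_j\,dV, \tag{6}$$

$$N_{ij} = \int_V \sigma\,\mathbf{N}_i \cdot \mathbf{N}_j\,dV, \tag{7}$$

$$S_{ij} = \int_V \sigma\,\mathbf{N}_i \cdot \mathrm{grad}\,N_j\,dV, \tag{8}$$

$$M_{ij} = \int_V \sigma\,\mathrm{grad}\,N_i \cdot \mathrm{grad}\,N_j\,dV, \tag{9}$$

$$b_i = \int_V \mathbf{N}_i \cdot \mathbf{J}\,dV. \tag{10}$$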

Moreover, the time derivative is approximated by a finite difference, and the unknown variables and the right-hand-side vector are interpolated as

$$x = \theta x^{k} + (1-\theta)\,x^{k-1}, \tag{11}$$

$$b = \theta b^{k} + (1-\theta)\,b^{k-1}, \tag{12}$$

where $x = [a\ \ \phi]^t$, $0 \le \theta \le 1$ and k denotes the time step.

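Equation (5) now becomes

$$\left(\frac{1}{\Delta t}\begin{bmatrix} N & S \\ S^t & M \end{bmatrix} + \theta\begin{bmatrix} K & 0 \\ 0 & 0 \end{bmatrix}\right)x^{k} = \left(\frac{1}{\Delta t}\begin{bmatrix} N & S \\ S^t & M \end{bmatrix} - (1-\theta)\begin{bmatrix} K & 0 \\ 0 & 0 \end{bmatrix}\right)x^{k-1} + \begin{bmatrix} \theta b^{k} + (1-\theta)\,b^{k-1} \\ 0 \end{bmatrix}, \tag{13}$$

where Δt is the time step interval. The transient solutions can be obtained by solving (13) at each time step.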
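The time stepping of (13) can be sketched for a generic system K x + C dx/dt = b. The code below is only an illustrative sketch under stated assumptions: it treats the θ-weighted values as belonging to steps k and k−1, uses small dense matrices with a direct solver instead of the sparse FE matrices, and the names `theta_step` and `run_demo` are hypothetical. The demo applies the scheme to the scalar equation 2x + dx/dt = 1, whose exact solution is known.

```python
import numpy as np


def theta_step(K, C, x_prev, b_prev, b_new, dt, theta):
    """One step of the theta scheme for K x + C dx/dt = b.

    Solves (C/dt + theta K) x_new
         = (C/dt - (1 - theta) K) x_prev + theta b_new + (1 - theta) b_prev.
    """
    lhs = C / dt + theta * K
    rhs = (C / dt - (1.0 - theta) * K) @ x_prev \
        + theta * b_new + (1.0 - theta) * b_prev
    return np.linalg.solve(lhs, rhs)


def run_demo(dt=0.01, t_end=1.0, theta=0.5):
    """Integrate 2 x + dx/dt = 1, x(0) = 0, and return (numerical, exact) x(t_end)."""
    K = np.array([[2.0]])
    C = np.array([[1.0]])
    b = np.array([1.0])          # constant right-hand side
    x = np.zeros(1)              # initial condition
    steps = int(round(t_end / dt))
    for _ in range(steps):
        x = theta_step(K, C, x, b, b, dt, theta)
    exact = 0.5 * (1.0 - np.exp(-2.0 * t_end))
    return x[0], exact
```

With θ = 0.5 the scheme is the Crank-Nicolson method and with θ = 1 it is the backward Euler method; the former is second-order accurate in Δt, which the demo reflects in a smaller error.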

B. Model Reduction 

As mentioned above, solving (13) at each time step is computationally expensive if the number of unknowns is large. To reduce the computational time, a reduced equation is derived from (13) by model reduction. To do so, snapshots are first obtained over an initial period by solving (13), and the variance-covariance matrix

$$C_m = X X^t \tag{14}$$

is constructed, where

$$X = [\,x^1 - \mu \quad x^2 - \mu \quad \cdots \quad x^s - \mu\,], \tag{15}$$

$x^i$, i = 1, 2, ..., s, are the snapshot solution vectors, s is the number of snapshots, m is the number of unknowns (m >> s) and μ is the mean vector of these solutions. Note that $C_m$ is a dense m×m matrix, the same size as the FE matrix. Therefore, solving the eigenvalue problem for (14) numerically is computationally prohibitive. To alleviate this problem, we consider the smaller s×s matrix defined by

$$C_s = X^t X \tag{16}$$

instead of $C_m$. The eigenvalues of $C_m$ and $C_s$ are identical except for the m−s additional zero eigenvalues of $C_m$, as shown below. The singular value decomposition of the matrix $X \in \mathbb{R}^{m \times s}$ is given by

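$$X = \sum_{i=1}^{s} \sigma_i u_i v_i^t = U \Sigma V^t, \tag{17}$$

where $U \in \mathbb{R}^{m \times s}$, $V \in \mathbb{R}^{s \times s}$ and

$$\Sigma = \mathrm{diag}[\sigma_1\ \ \sigma_2\ \ \cdots\ \ \sigma_s], \tag{18}$$

$\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_s \ge 0$, and $\sigma_i$ are the singular values of X. The matrices U and V satisfy

$$U^t U = I_s, \tag{19}$$

$$V^t V = I_s, \tag{20}$$

where $I_s$ denotes the s×s unit matrix. Then, the matrices $C_m$ and $C_s$ can be decomposed as follows:

$$C_m = U \Sigma^2 U^t, \tag{21}$$

$$C_s = V \Sigma^2 V^t. \tag{22}$$

From eqs. (21) and (22), we find that the nonzero eigenvalues of $C_m$ and $C_s$ are identical. Moreover, since

$$X v_i = \sigma_i u_i \tag{23}$$

holds, the eigenvectors of $C_m$ can be easily obtained by solving the eigenvalue problem for $C_s$.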
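The relation between the two eigenvalue problems can be verified on random data. This is a minimal sketch under assumptions: the snapshot matrix is filled with random numbers rather than FE solutions, and the helper names (`covariance_spectra`, `lift_eigenvector`, `demo`) are hypothetical. It compares the spectra of the m×m and s×s matrices and lifts the dominant eigenvector of the small problem to the large one via (23).

```python
import numpy as np


def covariance_spectra(X):
    """Eigenvalues of C_m = X X^t and C_s = X^t X, sorted in descending order."""
    eig_m = np.linalg.eigvalsh(X @ X.T)[::-1]
    eig_s = np.linalg.eigvalsh(X.T @ X)[::-1]
    return eig_m, eig_s


def lift_eigenvector(X, v_i, sigma_i):
    """Eigenvector u_i of C_m from eigenvector v_i of C_s, using X v_i = sigma_i u_i."""
    return X @ v_i / sigma_i


def demo(m=40, s=5, seed=1):
    rng = np.random.default_rng(seed)
    snapshots = rng.standard_normal((m, s))      # stand-in for FE snapshots
    X = snapshots - snapshots.mean(axis=1, keepdims=True)

    eig_m, eig_s = covariance_spectra(X)

    # Dominant eigenpair of the small s x s problem (eigh sorts ascending).
    lam, V = np.linalg.eigh(X.T @ X)
    sigma_1 = np.sqrt(lam[-1])
    u_1 = lift_eigenvector(X, V[:, -1], sigma_1)

    # Residual of the large eigenvalue problem C_m u_1 = sigma_1^2 u_1.
    residual = np.linalg.norm(X @ X.T @ u_1 - sigma_1**2 * u_1)
    return eig_m, eig_s, residual
```

Only the s×s eigenvalue problem is needed in practice; the m×m matrix is formed here solely to check the claim on a small example.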

Then, the dominant r eigenvectors are chosen to construct the matrix defined by

$$W = [\,w_1\ \ w_2\ \ \cdots\ \ w_r\,]. \tag{24}$$

It is assumed that the original unknown variable $x^k \in \mathbb{R}^m$ can be expressed as a linear combination of the reduced variables $y^k \in \mathbb{R}^r$ in the form

$$x^k = W y^k. \tag{25}$$

Using the transform (25), the original FE equation (13), written simply as Ax = b, can be reduced to

$$W^t A W y^n = W^t b^n. \tag{26}$$

Since the size of the coefficient matrix $W^t A W$ is r×r (m >> s > r), eq. (26) can be solved much faster than (13).
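The reduction steps (14) to (26) can be put together in a short sketch. It is not the authors' code: the system matrix is a random symmetric positive definite stand-in for the FE matrix, the snapshots are synthetic vectors lying in an r-dimensional subspace, and the names `pod_basis`, `reduced_solve` and `demo` are hypothetical. The sketch also assumes that the r selected singular values are strictly positive.

```python
import numpy as np


def pod_basis(snapshots, r):
    """Return W (m x r): dominant eigenvectors of C_m, computed via C_s = X^t X."""
    X = snapshots - snapshots.mean(axis=1, keepdims=True)   # eq. (15)
    lam, V = np.linalg.eigh(X.T @ X)                         # eq. (16), ascending
    idx = np.argsort(lam)[::-1][:r]                          # r largest eigenvalues
    sigma = np.sqrt(lam[idx])                                # assumed > 0
    return (X @ V[:, idx]) / sigma                           # eq. (23), column-wise


def reduced_solve(A, b, W):
    """Solve W^t A W y = W^t b and map back with x = W y (eqs. (25), (26))."""
    y = np.linalg.solve(W.T @ A @ W, W.T @ b)
    return W @ y


def demo(m=30, s=8, r=3, seed=2):
    rng = np.random.default_rng(seed)
    # Synthetic snapshots confined to an r-dimensional subspace span(Q).
    Q, _ = np.linalg.qr(rng.standard_normal((m, r)))
    snapshots = Q @ rng.standard_normal((r, s))
    W = pod_basis(snapshots, r)

    # Stand-in system whose exact solution lies in the same subspace.
    B = rng.standard_normal((m, m))
    A = B @ B.T + m * np.eye(m)
    x_true = Q @ rng.standard_normal(r)
    b = A @ x_true

    x_red = reduced_solve(A, b, W)
    return W, x_true, x_red
```

Because the test solution lies exactly in the span of the basis, the reduced solve reproduces it to round-off; for real transient fields the agreement depends on the snapshot interval, the snapshot period and r, which is the dependence studied in Section III.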

III. ANALYSIS OF BULK CONDUCTOR MODEL 

The bulk conductor model shown in Fig. 1 is analyzed using the present method. The FE model has 125000 nodes, 117649 elements, 367500 edges and 369036 unknown variables. The conductivity and relative permeability of the magnetic material are $0.5 \times 10^{7}$ S/m and 1000, respectively. The driving frequency is set to 50 Hz and the time step is Δt = $10^{-4}$ s. Under these conditions, the time constant τ is estimated to be about 0.01 s and the period T is 0.02 s, so the relation T > τ holds.

A. Dependence on snapshot interval and period 

We change the snapshot intervals and the period during 

which the snapshots are taken to clarify the dependence of 

the solution on them. In this study, the solution to eq. (13) 

is snapshotted from 0 to T with snapshot intervals 4∆t, 

2∆t and ∆t. The snapshot period is T, T/2 and T/4. 

Moreover, the number of the basis functions is set as r=40 

and 50. 

The time variation in the magnetic flux density |B| at 

the center of magnetic material is shown in Fig. 2. In this 

problem, we set the number of basis function as r=40. In 

Fig. 2(a), we find that there are no significant differences 

between the original solution and the solutions obtained 

by the conventional method and model reduction method 

with different snapshot intervals. Also, Fig. 2(b) plots the 

time changes in |B| for initial period, 0


TABLE I
DEPENDENCE OF COMPUTATIONAL TIME AND ERROR ON SNAPSHOT INTERVAL AND PERIOD

(A) SNAPSHOT INTERVAL
Snapshot interval           Δt      2Δt     4Δt
Computational time (%)      43.2    37.3    29.6
Error e (%)                 0.07    0.19    0.65

(B) SNAPSHOT PERIOD
Snapshot period             T       T/2     T/4
Error e (%)                 0.02    0.32    0.60

TABLE II
DEPENDENCE OF COMPUTATIONAL TIME AND ERROR ON NUMBER OF BASIS FUNCTIONS

Number of basis functions   20      30      40
Error e (%)                 2.38    0.29    0.07

TABLE III
DEPENDENCE OF COMPUTATIONAL TIME AND ERROR ON NUMBER OF BASIS FUNCTIONS FOR LONGER TIME CONSTANT

Number of basis functions   40      60      80
Error e (%)                 0.39    0.15    0.07

conductor, shown in Fig. 4, obtained by the conventional 

method and present method in which the snapshot period 

is set to T/4. The discrepancy can also be found in the initial responses shown in Fig. 3(b). This suggests that the long-range errors can be predicted from the initial responses. That is, by running the present method over a short initial period with different snapshot periods, an appropriate value of the snapshot period can be identified.

The error between the original solution and that obtained by the present method is defined by

$$e = \frac{\sum_i \left| H_i - H_i^{red} \right|}{\sum_i \left| H_i \right|} \times 100\ (\%), \tag{27}$$

where $H_i^{red}$ is the magnetic field obtained by the present method and $H_i$ is computed from the original solution. Tables I and II summarize the error e evaluated at t = 0.08 s, where the solutions have sufficiently converged to the steady state, together with the corresponding computational time. The error e becomes small as the snapshot interval decreases or the snapshot period increases. However, the computational time increases at the same time. This means that these parameters must be chosen considering both effects.

Figure 1: Bulk conductor model (coordinates x, y, z in mm; coil and magnetic material).

[Plot: magnetic flux density (T) versus time (s); curves: original solution without reduction, T/4, T/2, T.]

Figure 4: Eddy current distribution. (a) Original solution. (b) Solution obtained by the present method.

B. Dependence on number of basis functions 

In this section, we discuss the dependence of the solutions obtained by the present method on r, the number of basis functions. The snapshot period and interval are set to T and Δt, respectively. The analysis results are shown in Fig. 5. Both Fig. 5 (a) and (b) show that the solutions obtained by the present method approach the original solution as r increases.

The computational time and the error e depending on r are summarized in Table II. We can see that e decreases as r increases, while the computational time depends only slightly on r.

C. Bulk conductor with longer time constant 

To test the validity of the present method for systems 

with longer time constants, we increase the conductivity 




Figure 7: Distribution of the original solution and basis vectors. (a) Original solution. (b) Primal basis vector w1. (c) Second basis vector w2. (d) Fifth basis vector w5.

[Plot: |B| versus time (s); curves: original solution without reduction, Δt, 2Δt, 4Δt.]

sec. The snapshot period is set to T/2, T/4 and T/8. The 

snapshot interval and the number of basis vectors are set 

to Δt and 40, respectively. 

The time change in |B| at the center of the model is shown in Fig. 9, where (a) and (b) show the relatively long-range and initial responses, respectively. Due to the structure of the stacked iron core, the time constant of this system is much smaller than that of the bulk iron shown in Fig. 1. We can see in Fig. 9 that the solutions are in good agreement with the original solution except during the short initial period.

V. CONCLUSION 

In this paper, three-dimensional time-domain FE analysis using model order reduction based on the method of snapshots has been presented. The effectiveness of the present method has been shown for the bulk conductor and stacked iron core models. It has been found that the snapshot period and the number of basis functions have great influence on the transient solutions obtained by the present method. It has been suggested that these parameters could be determined appropriately by performing time marching for the first few steps with different parameter values. In future, we plan to apply the present method to nonlinear eddy current problems and high-frequency problems.

REFERENCES 

[1] Y. Takahashi, T. Tokumasu, M. Fujita, S. Wakao, T. Iwashita, and M. Kanazawa, “Improvement of convergence characteristic in nonlinear transient eddy-current analyses using the error correction of time integration based on the time-periodic FEM and the EEC method,” (in Japanese) IEEJ Trans. PE, vol. 129, no. 6, 2009.

[2] H. Igarashi, Y. Watanabe and Y. Ito, “Why Error Correction Methods Realize Fast Computations,” IEEE Trans. Magn., vol. 48, no. 2, pp. 415-418, 2012.

[3] P. Krysl, S. Lall, and J. Marsden, “Dimensional Model Reduction in Non-linear Finite Element Dynamics of Solids and Structures,” International Journal for Numerical Methods in Engineering, vol. 51, pp. 479-504, 2001.

[4] G. Kerschen, J. Golinval, A. F. Vakakis, and L. A. Bergman, “The method of proper orthogonal decomposition for dynamical characterization and order reduction of mechanical systems: an overview,” Nonlinear Dynamics, vol. 41, pp. 147-169, 2005.

[5] S. Rutenkroger, B. Deken, and S. Pekarek, “Reduction of Model Dimension in Nonlinear Finite Element Approximations of Electromagnetic,” Computers in Power Electronics, 2004, Proceedings. IEEE Workshop on, pp. 20-27, Aug. 2004.

[6] P. Holmes, J. L. Lumley, and G. Berkooz, Turbulence, Coherent Structures, Dynamical Systems and Symmetry. Cambridge University Press: Cambridge, 1996.



Calculation of eddy-current probe signal for a 

3D defect using global series expansion 

Sándor Bilicz, József Pávó and Szabolcs Gyimóthy 

Budapest University of Technology and Economics 

Department of Broadband Infocommunications and Electromagnetic Theory 

Goldmann Gy. tér 3.,1111 Budapest, Hungary 

E-mail: bilicz@evt.bme.hu 

Abstract—A novel eddy-current modeling technique of volumetric defects embedded in conducting plates is presented in 

the paper. This problem is of great interest in electromagnetic nondestructive evaluation (ENDE) and has already been studied exhaustively. The defect is modeled by a volumetric current dipole density which satisfies an integral equation. The latter is solved by the classical method of moments, which has usually been based on a volume discretisation of the defect. In contrast, as a new contribution, we propose the use of globally defined, continuous basis functions for the expansion of the current dipole density. This global expansion is expected to improve the numerical stability and the performance of the simulation. The proposed method is tested against both measured data and synthetic data obtained by a different defect model.

Index Terms—eddy-current modeling, integral equation, global expansion, moment method 


I. INTRODUCTION

Eddy-current nondestructive testing (ECT) is a widely

used technique to reveal and characterize in-material 

flaws (inclusions, voids, cracks, etc.) within conducting 

specimens. The principle of ECT is based on the local 

changes in the specimen’s electromagnetic (EM) parameters 

due to the flaw. These changes result in an EM field 

different from the field in the flawless case. Either the 

field directly, or a deduced quantity (e.g., impedance of 

a probe coil) is measured during a nondestructive test and 

the acquired data are used for the flaw reconstruction. 

The inverse problem of nondestructive testing can be ill-posed. This means that the existence, uniqueness and stability of the solution are not necessarily guaranteed. Beyond these theoretical challenges, the flaw characterization can be numerically demanding as well: the inversion algorithms are often iterative, i.e., several flaws are to be sequentially simulated in an optimisation loop. Consequently, a key element of the inversion is a fast and reliable numerical simulation of flaws.

Classical approaches to flaw modeling are the integral methods. They can cope with the difficulties arising from the small size of flaws relative to the excited region (which causes discretisation issues). The classical work [1] presents a flaw simulation where the flawed volume is discretised by a regular grid. The resulting volume integral equation is solved by the Method of Moments (MoM) [2], assuming a piecewise constant approximation of the EM field over each cell of the grid. By now, this method has been implemented in commercial software, e.g., [3], and has been successfully applied in inversion algorithms as well [4]. The volume integral method has recently been revisited in [5], where the EM field is expanded by means of locally defined splines. This provides the smoothness of the field, which is violated in the previous approach. Variational formalisms have also been tried with success: in [6], a Finite Element Method (FEM) scheme is presented for the separate computation of the field in the flawless specimen and the “reaction field” arising from the presence of the flaw. Coupled methods have been introduced, e.g., in [7]: a FEM code for the computation of the flawless field is coupled with a surface integral scheme of the ideally thin crack model. In [8], an ideally thin crack is considered which is modeled by a surface integral equation, again solved by MoM, using a piecewise constant approximation.

Though some of the works above were carried out decades ago, several pitfalls are still present in eddy-current flaw modeling. Today's challenges are mainly related to the increasing demands of flaw inversion in terms of accuracy and speed. Beyond being small, flaws can have a bad aspect ratio as well, making the volumetric models fail. It is also not straightforward how to choose between the volumetric and the ideally thin crack models for an arbitrary defect. Optimisation-based inversion schemes can perform badly if the sensitivity data are inaccurate. Another important issue is the convergence of the simulation with respect to the discretisation applied. In the case of a grid discretisation, this can only be controlled at the price of computational load.

These challenges inspired the improvement of the MoM-based discretisation techniques of the integral equation models. The formalisms cited above are solved using locally defined basis functions for the expansion of the EM field. A new approach has been presented in [9], where the basis functions were globally defined (i.e., all over the surface of the ideal crack) harmonic functions. The use of such a global expansion provided considerable advantages over local expansions. In this paper, we present the use of global expansion functions for volumetric flaw modeling. In a certain sense, this is an extension of the method formalized for ideally thin flaws in [9].

II. THE VOLUME INTEGRAL METHOD 

Let us consider a non-magnetic, conducting specimen (to be tested for material flaws) with a homogeneous conductivity σ0. A time-harmonic source (typically, a coil) near the specimen induces eddy currents within the conductive medium. In the presence of a flaw embedded in the volume region V, the otherwise constant conductivity of the specimen changes locally: σ = σ(r), r ∈ V, so the EM field changes, too. The EM field can be decomposed into a so-called incident term and a defect term:

$$\mathbf{E}(\mathbf{r}) = \mathbf{E}^i(\mathbf{r}) + \mathbf{E}^d(\mathbf{r}), \tag{1}$$

where only $\mathbf{E}^i(\mathbf{r})$ would exist in the flawless (σ(r) ≡ σ0) specimen, whereas $\mathbf{E}^d(\mathbf{r})$ arises due to the flaw. The latter is regarded as the field of a fictitious source distribution which is placed in the flawless specimen and has exactly the same effect as the flaw [1]. Formally, let the secondary source be a current dipole density P = (σ(r) − σ0)E, r ∈ V. Then $\mathbf{E}^d(\mathbf{r})$ can be expressed as

$$\mathbf{E}^d(\mathbf{r}) = -j\omega\mu_0 \int_V \mathbf{G}(\mathbf{r}|\mathbf{r}')\,\mathbf{P}(\mathbf{r}')\,dV', \tag{2}$$

where G(r|r ′ ) is the electric-electric dyadic Green’s 

function transforming the current density excitation at 

the point r ′ to the generated electric field at the point 

r. ω is the angular frequency of the source and μ0 is 

the vacuum permeability. By substituting (2) into (1), 

using the definition of P, we get a Fredholm-type integral 

equation of the second kind for the unknown current 

dipole density:

$$\frac{1}{\sigma(\mathbf{r}) - \sigma_0}\,\mathbf{P}(\mathbf{r}) + j\omega\mu_0 \int_V \mathbf{G}(\mathbf{r}|\mathbf{r}')\,\mathbf{P}(\mathbf{r}')\,dV' = \mathbf{E}^i(\mathbf{r}). \tag{3}$$

Once the integral equation is solved, P(r) can be used to derive quantities that can be measured during the nondestructive test. In the illustrative cases presented in this paper, a probe coil is used both for the excitation of the field and for the acquisition of the measured data via its complex impedance variation ΔZ. As a consequence of the reciprocity principle [10], ΔZ can be computed as

$$\Delta Z = -\frac{1}{I^2} \int_V \mathbf{E}^i(\mathbf{r}) \cdot \mathbf{P}(\mathbf{r})\,dV, \tag{4}$$

with I being the amplitude of the probe coil's current. The decomposition (1) lets $\mathbf{E}^i(\mathbf{r})$ and $\mathbf{G}(\mathbf{r}|\mathbf{r}')$ be computed separately, which provides well-known advantages from the viewpoint of numerical evaluation. Moreover, the formula (4) can also be easily evaluated.

Let us also highlight that a potential pitfall of the volume integral method is the proper computation of the Green's function. Due to its singularity, a numerically stable expression of G(r|r′) often requires special effort, as will be shown in Subsection III-C.

III. SOLUTION OF THE INTEGRAL EQUATION 

A. The studied configuration 

We restrict our studies to a special, but practically 

important configuration, outlined in Fig. 1. The specimen 

is assumed to be a homogeneous conducting plate with 

a finite thickness. The dimensions of the plate in the x 

and y directions are assumed to be infinite. The flaw 

is of cuboid shape, with four edges perpendicular to the plate surface. The edge lengths of the flaw are A, B and D, respectively, and the volume V is defined as

$$|x| \le \frac{A}{2},\quad |y| \le \frac{B}{2}\quad \text{and}\quad |z - C| \le \frac{D}{2}, \tag{5}$$

where C is the z coordinate of the center of the flaw. The conductivity σ(r) within the flaw volume V is known; typically, σ(r) = 0.

Figure 1. The studied configuration (top view and side view). An air-cored pancake-type coil scans above the infinite plate near the flaw. For generality, a buried flaw is sketched; however, we deal with ID and OD flaws.

The probe coil is actually an air-cored pancake-type 

probe, driven by a time-harmonic current. During the 

nondestructive test, the coil scans above the damaged 

zone and its impedance variation is measured at given 

coil positions. 

B. Global approximation of the current dipole density 

Let us approximate the solution P(r) of the integral equation (3) by means of a finite series. Let the basis functions of this expansion be products of three factors, each depending only on one Cartesian coordinate:

$$\mathbf{P}(\mathbf{r}) = \sum_{m=-M}^{M} \sum_{n=-N}^{N} \sum_{q=-Q}^{Q} \mathbf{P}_{mnq}\, f_x^m(x)\, f_y^n(y)\, f_z^q(z). \tag{6}$$

The key idea in this paper is the choice of the basis functions: in contrast to the classical schemes, each basis function is defined globally, i.e., all over the flaw volume V. Let us note that this is where the special restrictions on the shape of the flaw are needed. We propose the use of the following harmonic factors in the basis functions:

$$f_x^m(x) = \frac{1}{\sqrt{A}} \exp\left(2\pi j\,\frac{m x}{A}\right),\quad f_y^n(y) = \frac{1}{\sqrt{B}} \exp\left(2\pi j\,\frac{n y}{B}\right),\quad f_z^q(z) = \frac{1}{\sqrt{D}} \exp\left(2\pi j\,\frac{q (z - C)}{D}\right). \tag{7}$$

In fact, this leads to a three-dimensional complex Fourier series; the integers m, n and q are the harmonic orders. The basis functions form an orthonormal set with respect to the scalar product:

$$\langle g(\mathbf{r}), h(\mathbf{r}) \rangle := \int_V g(\mathbf{r})\, h^{\star}(\mathbf{r})\,dV. \tag{8}$$

Let us notice that our choice for the basis functions 

provides a smooth approximation of P, instead of the 

piecewise constant approximation discussed in [1]. In 

Section IV, the advantages provided by the global expansion 

are discussed along with numerical examples. 
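The orthonormality of the harmonic factors with respect to the scalar product (8) can be checked numerically in one dimension. The sketch below rests on an assumption: the factor is taken as exp(2πj·m·x/A) normalised by 1/√A, which is the normalisation that makes the set orthonormal on an interval of length A. The function names and the sample count are illustrative only.

```python
import numpy as np


def basis_1d(order, length, x, center=0.0):
    """Harmonic factor exp(2*pi*j*order*(x - center)/length) / sqrt(length)."""
    return np.exp(2j * np.pi * order * (x - center) / length) / np.sqrt(length)


def gram_matrix(max_order, length, samples=64):
    """Matrix of scalar products <f^m, f^n> for m, n = -max_order..max_order.

    The integral over [-length/2, length/2) is evaluated with the rectangle
    rule, which is exact for these periodic functions as long as the number
    of samples exceeds the largest difference of harmonic orders.
    """
    x = -length / 2 + length * np.arange(samples) / samples
    orders = np.arange(-max_order, max_order + 1)
    F = np.array([basis_1d(m, length, x) for m in orders])
    return (F @ F.conj().T) * (length / samples)
```

For example, `gram_matrix(3, 0.21e-3)` uses the flaw width A = 0.21 mm of test cases #1 and #2 and should return the 7×7 identity matrix up to round-off.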

C. Discretisation by the Method of Moments; computation 

of the matrix elements 

The R = (2M+1)(2N+1)(2Q+1) unknown vectorial coefficients $\mathbf{P}_{mnq}$ in the series (6) are determined by means of the Method of Moments. The testing functions are the same as the basis functions (Galerkin method), and a linear system of R vectorial equations is obtained. For a convenient formulation, let the basis functions be ordered so that each triplet of harmonic orders (m, n, q) has a unique index k (1 ≤ k ≤ R), and denote the kth basis function as

$$w_k(x, y, z) = f_x^m(x)\, f_y^n(y)\, f_z^q(z). \tag{9}$$
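One possible ordering of the harmonic triplets is sketched below. The text only requires that each triplet (m, n, q) has a unique index k between 1 and R; the particular ordering used here (q varying fastest) and the function names are assumptions made for illustration.

```python
def harmonic_index(m, n, q, M, N, Q):
    """Map harmonic orders (m, n, q) to a 1-based index k, with q varying fastest."""
    if not (-M <= m <= M and -N <= n <= N and -Q <= q <= Q):
        raise ValueError("harmonic order out of range")
    return ((m + M) * (2 * N + 1) + (n + N)) * (2 * Q + 1) + (q + Q) + 1


def harmonic_orders(k, M, N, Q):
    """Inverse of harmonic_index: recover (m, n, q) from the 1-based index k."""
    rest, q_shift = divmod(k - 1, 2 * Q + 1)
    m_shift, n_shift = divmod(rest, 2 * N + 1)
    return m_shift - M, n_shift - N, q_shift - Q
```

With M = 1, N = 2 and Q = 3, the indices run from 1 to R = 3·5·7 = 105, and `harmonic_orders(harmonic_index(m, n, q, 1, 2, 3), 1, 2, 3)` returns the original triplet.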

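The elements in the system matrix of the linear equations are in the form

$$a_{lk}^{\lambda\kappa} = (\mathbf{e}_\lambda \cdot \mathbf{e}_\kappa) \left\langle w_l(\mathbf{r}),\, \frac{w_k(\mathbf{r})}{\sigma - \sigma_0} \right\rangle + j\omega\mu_0 \left\langle w_l(\mathbf{r}),\, \mathbf{e}_\lambda \cdot \int_V \mathbf{G}(\mathbf{r}|\mathbf{r}')\,(\mathbf{e}_\kappa w_k(\mathbf{r}'))\,dV' \right\rangle, \tag{10}$$

where l and k are the indices of the test and basis functions (l, k = 1, 2, ..., R), $\mathbf{e}_\lambda$ and $\mathbf{e}_\kappa$ are the unit vectors (λ, κ = x, y, z) and ⟨ , ⟩ stands for the scalar product.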


The evaluation of the integral with respect to r′ needs special numerical treatment due to the singularity of the Green's function. However, for the considered planar geometry and the proposed basis functions, the singular kernel can be handled by means of the spectral method presented in detail in [11]. In brief, by using the 2-dimensional spatial Fourier transform in the xy plane, the spectrum of the Green's function can be represented as a sum of plane waves traveling along z. Due to the product-separation form (7) of the basis functions, the integral with respect to r′ in (10) splits into a factor depending only on z and z′ and another factor which is represented by its 2D Fourier transform. The integral with respect to z′ can be evaluated analytically in the spectral domain, and the remaining factor (to be inverse transformed) is no longer singular.

As a useful consequence of the Galerkin method, certain elements of the resulting system matrix must be identical. In (10), the volume integrals with respect to r (due to the scalar product) and to r′ can be interchanged. According to the reciprocity theorem, we have

$$a_{lk}^{\lambda\kappa} \equiv a_{k'l'}^{\kappa\lambda}, \tag{11}$$

for all k and l if $w_k(\mathbf{r}) \equiv w_{k'}(\mathbf{r})^{\star}$ and $w_l(\mathbf{r}) \equiv w_{l'}(\mathbf{r})^{\star}$ hold, respectively. This equivalence can be used to check the numerical computations and/or to reduce the computational load.

Finally, let us notice that the presence of the probe coil 

is neglected in the expression of the Green’s function. 

This is usual and does not cause considerable error. 

D. Computation of the incident field 

The incident field $\mathbf{E}^i(\mathbf{r})$ can be computed analytically in the studied case. The pancake-type coil generates an axisymmetric field which depends only on z and $r = \sqrt{(x - x_c)^2 + (y - y_c)^2}$. This field can be expressed in the form of an integral of first-order Bessel functions, as detailed in the classical work [12].

More complicated probes (e.g., including a ferrite core or having rectangular-shaped turns) can also be considered. However, in such cases, $\mathbf{E}^i(\mathbf{r})$ is more difficult to compute (e.g., by a Finite Element Method).

Once the incident field is obtained within the flaw volume V, the excitation vector of the linear system of equations resulting from the MoM can be assembled from the entries

$$b_l^{\lambda} = \left\langle w_l(\mathbf{r}),\, \mathbf{e}_\lambda \cdot \mathbf{E}^i(\mathbf{r}) \right\rangle, \tag{12}$$

where l = 1, 2, ..., R is the index of the testing function and $\mathbf{e}_\lambda$ is the unit vector (λ = x, y, z). As a consequence of the axial symmetry, $b_l^z \equiv 0$ holds for all l.

E. Implementation issues 

The algorithms are coded in Matlab®. The spectral domain expression of the Green’s function is inverse transformed by a 2-dimensional Fast Fourier Transform (FFT2) routine. The width of the FFT2’s spatial window in the xy plane is estimated from the skin depth within the conductive medium, whereas the spectral window is assigned with respect to the harmonic orders m and n, respectively.

The integrals involved by the scalar products in (10) 

and (12) are evaluated numerically, based on a regular 

discretisation of the flaw volume. The number of samples 

along each axis is set with respect to the harmonic orders 

m, n and q of the basis and testing functions, respectively. 

IV. TEST CASES AND COMPARISONS 

In this section, the proposed method is illustrated 

and its main advantages are highlighted via numerical 

examples. 

A. Definition of the configurations 

The illustrative test cases are presented in Fig. 1. The 

air-filled rectangular flaw has constant zero conductivity. 

Though a buried flaw is outlined in the sketch, we present 

cases for ID-type (“inner defect”, opening to the top 

surface of the plate: C = −D/2) and OD-type (“outer 

defect”, C = −d + D/2) flaws only. 

Experimental data of the variation of the coil’s impedance (ΔZ) are available on the xc = 0 line as a function of yc, at discrete coil positions. Cases #1 and #2 are JSAEM Benchmarks [13], whereas case #3 is the also frequently cited TEAM Benchmark no. 15 [14]. The parameters of each case are given in Tab. I.

Table I
NUMERICAL DESCRIPTION OF THE TEST CASES. (NOTATION IS ACCORDING TO FIG. 1.)

Parameter        #1           #2        #3
Specimen
  d (mm)         1.25         1.25      12.22
  σ0 (MS/m)      1            1         30.6
Flaw
  A (mm)         0.21         0.21      0.28
  B (mm)         10           10        12.6
  D (mm)         0.75         0.5       5
  C (mm)         −d + D/2     −D/2      −D/2
Probe coil
  r1 (mm)        0.6          0.6       6.15
  r2 (mm)        1.6          1.6       12.4
  h (mm)         0.8          0.8       6.15
  l (mm)         0.5          1.0       0.88
  f (kHz)        150          300       0.9
  Turns          140          140       3790

B. Convergence of the series 

One of the main advantages of the global expansion 

method is the easy access to the convergence with respect 

to the maximal harmonic orders of M, N and Q in the 

series of P. By adding further terms of higher orders, 

the previously computed elements of the system matrix 

remain unchanged. Consequently, convergence studies 

can be performed at a much lower computational cost 

than in the case of local basis functions (the latter needs 

a new grid whenever a finer discretisation is set). 


We have computed ΔZ in test case #1 using different maximal orders MNQ. The discrepancy between two impedance signals is expressed by the norm

$$\|\Delta Z\| := \sqrt{\frac{1}{K}\sum_{k=1}^{K} \left|\Delta Z(y_{c,k})\right|^2}, \qquad (13)$$

where the number of coil positions is K = 11 and $y_{c,k} = (k-1)$ mm. In Fig. 2, the normalised discrepancy between the first impedance signal (MNQ = 121) and some others obtained by higher-order approximations is shown. Fast convergence is observed: e.g., there is ca. 5% discrepancy between the impedance signals computed with MNQ = 121 and with the higher orders MNQ = 163. Let us note that the variation of the current dipole density in the x-direction is smooth enough to be modeled by a first-order Fourier series, i.e., the choice M = 1 seems to be appropriate.

Figure 2. Normalised discrepancy ‖ΔZ − ΔZ_121‖ / ‖ΔZ_121‖ (vertical axis, 0 to 0.08) between impedance signals obtained by different maximal harmonic orders MNQ (marked above each bar: 121, 131, 141, 151, 161, 122, 132, 142, 152, 162, 123, 133, 143, 153, 163) in test problem #1.
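The normalised discrepancy plotted in Fig. 2 can be evaluated as sketched below. This is a minimal Python illustration that assumes the norm (13) is the root-mean-square value over the K coil positions; the impedance samples are made-up numbers, not results of this paper.

```python
import math


def signal_norm(dz):
    """Norm of eq. (13), read as sqrt((1/K) * sum_k |dZ_k|^2)."""
    return math.sqrt(sum(abs(z) ** 2 for z in dz) / len(dz))


def normalised_discrepancy(dz, dz_ref):
    """||dZ - dZ_ref|| / ||dZ_ref|| for two signals sampled at the same positions."""
    diff = [a - b for a, b in zip(dz, dz_ref)]
    return signal_norm(diff) / signal_norm(dz_ref)


# Hypothetical impedance signals at K = 11 coil positions (complex values).
dz_ref = [complex(3.0, 4.0)] * 11           # stands in for the MNQ = 121 signal
dz_new = [1.05 * z for z in dz_ref]         # stands in for a higher-order signal
discrepancy = normalised_discrepancy(dz_new, dz_ref)
```

With these made-up samples the second signal differs from the reference by a uniform 5%, so the function returns a value of about 0.05.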

The fast convergence is also explained by the behavior of the coefficients $P_{mnq}$. Again in test case #1, we examined the coefficients of the x-directed current dipole density $P^x(\mathbf{r})$ (note that $P^x$ is much more dominant than $P^y$ and $P^z$ in this case) for a centered coil location (xc = yc = 0). Some coefficients of the largest magnitude are plotted in Fig. 3. A fast decrease of the magnitudes is observed as the harmonic orders increase.

C. Comparison to experimental data and to the ideally thin crack model

Figure 3. Normalized magnitudes of the coefficients $|P^x_{mnq}|$ of the x-directed current dipole density $P^x(\mathbf{r})$ in test problem #1. The 34 highest magnitudes are plotted (in total there are (2M + 1)(2N + 1)(2Q + 1) = 273 coefficients, as M = 1, N = 6 and Q = 3 are chosen); the triplets mnq (any of the indices can be negative) are marked above each bar. Triplets marked in the plot: (0,1,0), (0,−1,0), (1,1,0), (−1,1,0), (1,−1,0), (−1,−1,0), (0,−2,0), (0,2,0), (1,−2,0), (−1,−2,0), (1,2,0), (−1,2,0), (0,3,0), (0,−3,0), (−1,1,1), (−1,−1,1), (1,1,1), (1,−1,1), (−1,1,−1), (−1,−1,−1), (1,1,−1), (1,−1,−1), (1,1,2), (−1,1,2), (1,−1,2), (−1,−1,2), (1,1,3), (1,−1,3), (−1,1,3), (−1,−1,3), (1,1,−2), (−1,1,−2), (1,−1,−2), (−1,−1,−2).

The volumetric flaw model using global expansion is a sort of extension of the model proposed in [9]. Therein, ideally thin cracks are considered and modeled by a surface layer of current dipole density. This can be imagined as if the edge length A of the crack (Fig. 1) collapsed to zero while the x component of the total electric field E vanishes on the crack surface. For the surface current dipole density, certain boundary conditions must hold; thus, the basis functions in the series expansion (e.g., for ID cracks) are products of the following sine and cosine functions:

$$g^{n}_{y}(y) = \sqrt{\frac{2}{B}}\,\sin\!\left(n\pi\,\frac{y + B/2}{B}\right), \qquad g^{q}_{z}(z) = \sqrt{\frac{2}{D}}\,\cos\!\left(\frac{(2q-1)\pi z}{2D}\right), \qquad (14)$$

with the integers n and q as “harmonic orders”. Herein we do not deal with the surface model in detail, but only present some results for comparison, provided by the authors of [9].

In Figs. 4, 5 and 6, the impedances (i) computed by our volumetric model, (ii) computed by the surface model and (iii) measured (data provided with the benchmarks) are compared. The comparisons lead to the following conclusions:

• The volumetric model can appropriately reconstruct 

the measured data at very low maximal harmonic 

orders. Let us note again that for instance, when 

MNQ = 122, we have only 75 (vectorial) unknowns 

in the series (6). 

• Though there is no straightforward connection between the harmonic orders of the volumetric and surface models, in the presented cases the volumetric model provides better results in the sense of |ΔZ| than the surface model with more or less the same harmonic orders. In Figs. 4 and 5, the surface model appears to be unstable at NQ = 44. (The surface model also has good convergence properties, but only at considerably higher N and Q, which are not presented herein.)

• The volumetric model tends to slightly overestimate 

|ΔZ| and underestimate the phase arg{ΔZ}. 

Whereas the phase error is acceptable in Figs. 4 and 

5, it becomes considerable in Fig. 6. 
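The size of the discrete problem follows directly from the maximal harmonic orders, since each index runs over negative, zero and positive values. The short Python sketch below (the function names are ours, chosen for illustration) reproduces the counts quoted in the text: 75 vectorial unknowns for MNQ = 122 and 273 coefficients for M = 1, N = 6, Q = 3.

```python
def n_coefficients(M, N, Q):
    """Number of triplets (m, n, q) with |m| <= M, |n| <= N, |q| <= Q."""
    return (2 * M + 1) * (2 * N + 1) * (2 * Q + 1)


def triplets(M, N, Q):
    """Enumerate all harmonic index triplets of the global expansion."""
    return [(m, n, q)
            for m in range(-M, M + 1)
            for n in range(-N, N + 1)
            for q in range(-Q, Q + 1)]


count_122 = n_coefficients(1, 2, 2)   # MNQ = 122
count_163 = n_coefficients(1, 6, 3)   # M = 1, N = 6, Q = 3
```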


Figure 4. Impedance variation in the test case #1 (JSAEM OD-60 Benchmark, 150 kHz): |ΔZ| (mΩ) and arg{ΔZ} (rad) versus coil position yc (mm). Curves: measured, vol 011, vol 122, surf 44, surf 66, surf 88. Legend: “vol” and “surf” refer to the volumetric and the surface model. The maximal harmonic orders MNQ and NQ used in the simulations are also given.

Figure 5. Impedance variation in the test case #2 (JSAEM ID-40 Benchmark, 300 kHz): |ΔZ| (mΩ) and arg{ΔZ} versus coil position yc (mm). Curves: measured, vol 011, vol 122, surf 44, surf 66, surf 88. Legend notations are explained in Fig. 4.

The computation times are quite short. The assembly of the system matrix took, e.g., 285 s for the OD60 flaw and 202 s for the ID40 flaw when a basis function set with maximal orders M = 1, N = 4, Q = 8 was used for both. Though the number of samples within the flaws for the numerical integration is the same in both cases, the OD60 flaw computation needs a wider spatial window for the FFT2 as the lower frequency yields a larger skin depth.
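The difference in skin depth between the two JSAEM cases can be quantified with the standard formula δ = 1/√(π f μ0 σ). The Python sketch below assumes a non-magnetic plate and uses the frequencies and the conductivity of Tab. I; how the FFT2 window width is derived from δ is not specified here, so only δ itself is computed.

```python
import math

MU0 = 4e-7 * math.pi  # permeability of free space (H/m)


def skin_depth(f_hz, sigma):
    """Standard skin depth (m) of a non-magnetic conductor of conductivity sigma (S/m)."""
    return 1.0 / math.sqrt(math.pi * f_hz * MU0 * sigma)


delta_od60 = skin_depth(150e3, 1e6)   # case #1: 150 kHz, 1 MS/m
delta_id40 = skin_depth(300e3, 1e6)   # case #2: 300 kHz, 1 MS/m
print(f"OD60: {delta_od60 * 1e3:.2f} mm, ID40: {delta_id40 * 1e3:.2f} mm")
```

The skin depth is about 1.30 mm at 150 kHz and about 0.92 mm at 300 kHz, i.e., larger by a factor of √2 in the OD60 case, which is consistent with the wider spatial window mentioned above.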

Figure 6. Impedance variation in the test case #3 (TEAM Benchmark): |ΔZ| (Ω) and arg{ΔZ} versus coil position yc (mm). Curves: measured, vol 011, vol 122, surf 22, surf 44, surf 66. Legend notations are explained in Fig. 4.

V. CONCLUSION AND PERSPECTIVES

The classical integral equation models of volumetric flaws have been used for decades in the simulation of ECT. Though these schemes have many advantages, several bottlenecks are still present. In this paper, we proposed a new discretisation technique for the numerical solution of the volume integral equation. Instead of the locally defined pulse basis functions, we use globally defined harmonic basis functions for the expansion of the unknown current dipole density distribution. Thanks to this choice, an improvement in the accuracy and performance of the simulation has been observed in the test cases. The results obtained so far are promising, but the research certainly needs to be continued, with special emphasis on the following:

• More parametric studies are needed for a wide range of flaw sizes and frequencies to confirm that the new scheme outperforms the existing ones and, at the same time, to point out its limitations. We have not yet considered, for instance, through-plate flaws.

• The method could easily be extended to the case of flaws embedded in thick plates (modeled as a half-space). The extension should also be possible to the case of a layered medium, which might be of more practical interest.

• As the aspect ratio of the flaw (e.g., width/length) gets worse, the volumetric model is expected to become less accurate and the ideally thin crack model should be applied instead. However, the relation between the two models has not been exactly revealed yet. One expects the results of the volumetric model to converge to the results of the surface model as the width of the flaw collapses. This should be studied both theoretically and numerically.

• The inverse problem is often formalised as an optimisation task of minimizing the discrepancy between the measured and simulated data. The gradient-based schemes require the sensitivity data with respect to the parameters of the flaw. This sensitivity is accessible, e.g., via the adjoint problem [15]. However, the numerical stability of the gradient computation strongly depends on the precision of the EM field calculation near the boundaries of the flaw. The proposed global expansion of P could improve the precision in these crucial regions.


The authors think that the contribution of this paper 

can be of industrial interest as well, if the further numerical 

studies remain convincing about its performance. 

VI. ACKNOWLEDGEMENTS 

This research is supported by the Hungarian Science 

Research Fund (OTKA grant no. K105996). 

REFERENCES 

[1] J. R. Bowler, S. A. Jenkins, L. D. Sabbagh, and H. A. Sabbagh, 

“Eddy-current probe impedance due to a volumetric flaw,” Journal of Applied Physics, vol. 70, no. 3, pp. 1107–1114, 1991.

[2] R. F. Harrington, Field computation by moment methods. 

Macmillan, 1968. 

[3] CIVA. “CIVA: State of the art simulation platform for NDE”. 

[Online]. Available: http://www-civa.cea.fr 

[4] S. Bilicz, E. Vazquez, M. Lambert, S. Gyimóthy, and J. Pávó, 

“Characterization of a 3D defect using the expected improvement 

algorithm,” COMPEL: The International Journal for Computation 

and Mathematics in Electrical and Electronic Engineering, 

vol. 28, no. 4, pp. 851–864, 2009. 

[5] C. Reboud, D. Prémel, D. Lesselier, and B. Bisiaux, “New discretisation 

scheme based on splines for volume integral method: 

Application to eddy current testing of tubes,” COMPEL: The 

International Journal for Computation and Mathematics in Electrical 

and Electronic Engineering, vol. 27, no. 1, pp. 288–297, 

2008. 

[6] Z. Badics, Y. Matsumoto, K. Aoki, F. Nakayasu, M. Uesaka, and 

K. Miya, “Accurate probe-response calculation in eddy current 

NDE by finite element method,” Journal of Nondestructive 

Evaluation, vol. 14, pp. 181–192, 1995. [Online]. Available: 

http://dx.doi.org/10.1007/BF00730888 

[7] Y. Le Bihan, J. Pávó, M. Bensetti, and C. Marchand, “Computational environment for the fast calculation of ECT probe signal by field decomposition,” IEEE Transactions on Magnetics, vol. 42, no. 4, pp. 1411–1414, 2006.

[8] J. R. Bowler, “Eddy-current interaction with an ideal crack. I. The 

forward problem,” Journal of Applied Physics, vol. 75, no. 12, pp. 

8128–8137, 1994. 

[9] J. Pávó and D. Lesselier, “Calculation of eddy current testing 

probe signal with global approximation,” IEEE Transactions on 


[10] R. F. Harrington, Time-harmonic electromagnetic fields. 

McGraw-Hill, 1961. 

[11] J. Pávó and K. Miya, “Reconstruction of crack shape by optimization 

using eddy current field measurement,” IEEE Transactions on 


[12] C. V. Dodd and W. E. Deeds, “Analytical solutions to eddy-current 

probe-coil problems,” Journal of Applied Physics, vol. 39, no. 6, 

pp. 2829–2838, 1968. 

[13] T. Takagi, M. Uesaka, and K. Miya, “Electromagnetic NDE 

research activities in JSAEM,” in Electromagnetic Nondestructive 

Evaluation, ser. Studies in Applied Electromagnetics and Mechanics, 

T. Takagi, J. R. Bowler, and Y. Yoshida, Eds. IOS Press, 

1997, vol. 1, pp. 9–16. 

[14] T.E.A.M. Benchmark Problems. Accessed on 7.08.2012. [Online]. 

Available: http://www.compumag.org/jsite/team.html 

[15] S. J. Norton and J. R. Bowler, “Theory of eddy current inversion,” 

Journal of Applied Physics, vol. 73, no. 2, pp. 501–512, 1993.


Computation of the Motion of Conducting Bodies 

Using the Eddy-Current Integral Equation 

*Mihai Maricaru, † Ioan R. Ciric, *Horia Gavrila, *George-Marian Vasilescu and *Florea I. Hantila 

*Department of Electrical Engineering, Politehnica University of Bucharest, Spl. Independentei 313, 

Bucharest, 060042, Romania, E-mail: mihai.maricaru@upb.ro 

† Department of Electrical and Computer Engineering, The University of Manitoba, Winnipeg, MB R3T 5V6, Canada 

Abstract—The analysis of the motion of a system of solid conductors in the presence of magnetic fields is performed by 

solving the classical mechanics equation of motion under the action of magnetic forces. Application of the eddy-current 

integral equation and the use of local coordinates attached to the bodies in motion allow the determination of the
electromagnetic field without the need to reconstruct the discretization grid at each new position of the conducting

bodies. Only the submatrices associated with the coupling between the bodies in relative motion are modified in the global 

system matrix. A time-domain method of solution is first presented for the electromagnetic field problem, coupled with the 

equation of motion, which can be efficiently applied at high frequencies when the time steps are small. The eddy-current 

integral equation for the derivative of current density contains a term that takes into account the relative motion of the 

bodies. Since the electromagnetic quantities vary much more rapidly than the mechanical quantities, a second method is also 

proposed in this paper, where the eddy-current integral equation is solved in the frequency domain by assuming that the 

bodies are motionless, but by adding supplementary terms due to the actual motion of the bodies. Thus, only the average 

force over a period of time is now computed. This method is extremely efficient especially at higher frequencies when the 

time steps are very small. 

Index Terms—eddy-current integral equation, electrodynamics of moving conductors, levitation. 

I. INTRODUCTION


The equation of translational motion of a solid conducting body of mass m under the action of the magnetic force F is

$$m\,\frac{d^2\mathbf{r}}{dt^2} = \mathbf{F}\!\left(\mathbf{r}, \frac{d\mathbf{r}}{dt}, \mathbf{i}\right) + \mathbf{G} \qquad (1)$$

where r is the position vector of a point of the body, for instance of its center of gravity, the third argument of F (written here as i) is a vector representing the imposed current distribution, and G is the gravitational force acting on the body. Equation (1) is discretized in time and F is determined at each time step by solving an electromagnetic field problem in the region with moving bodies. The application of the Finite Element Method requires a tremendous amount of computation since it is necessary to reconstruct the discretization mesh at each time step. Moreover, the modifications of the discretization mesh are usually accompanied by undesired cumulative errors in the successive solutions of the electromagnetic field. A substantial improvement can be achieved when adopting hybrid Finite Element – Boundary Element Methods [1]. Using the “laboratory” frame of reference complicates the field problem solution due to the presence of the motional electric field intensity $\mathbf{v}_0 \times \mathbf{B}$, where $\mathbf{v}_0$ is the body velocity in this frame of reference and B the magnetic induction. This disadvantage is eliminated when employing local frames of reference, attached to the bodies in motion [1], [2]. This also allows the usage of the simpler eddy-current integral equation for the bodies at rest, as has been done in the case where the velocities of the bodies are known [2]. However, in many situations the velocities of the bodies are not known, as, for instance, in the case of electromagnetic levitation, their determination constituting one of the objectives of the present work.

In the case of electromagnetic levitation, to ensure the stability of the solution it is necessary to choose a sufficiently small time step. Since for the same accuracy of the results the time period has to be divided into practically the same number of intervals (for example, at 50 Hz into 200 intervals [1]), the time step decreases at higher frequencies. Unfortunately, as the time step decreases, the successively computed solutions tend to be very close to each other and the errors in the solution differences increase considerably, so that the computation procedure becomes inefficient.

In the present paper, a new procedure is described for 

the time-domain solution of the eddy-current integral 

equation applicable to small time steps. As well, a 

technique is proposed for accelerating the determination 

of the trajectory of the moving bodies, based on the 

frequency-domain solution of the eddy-current integral 

equation. 

II. TIME-DOMAIN SOLUTION OF THE EDDY-CURRENT 

INTEGRAL EQUATION 

For two-dimensional field problems, the time-domain eddy-current integral equation for motionless conductors is

$$\rho\,J(\mathbf{r},t) + \frac{\mu_0}{2\pi}\,\frac{d}{dt}\int_{\Omega} J(\mathbf{r}',t)\,\ln\frac{1}{R}\,dS' = -\frac{\mu_0}{2\pi}\,\frac{d}{dt}\int_{\Omega_i} J_i(\mathbf{r}',t)\,\ln\frac{1}{R}\,dS' \qquad (2)$$

where r and r′ are the position vectors of the observation point and of the source point, respectively, $\Omega$ is the region containing the solid conductors, $\Omega_i$ is the region where the imposed current density $J_i$ is confined, $\rho$ is the resistivity, $R = |\mathbf{r} - \mathbf{r}'|$, and $\mu_0$ is the permeability of free space.

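In the three-dimensional case, the eddy-current integral equation has the form

$$\rho\,\mathbf{J}(\mathbf{r},t) + \frac{\mu_0}{4\pi}\,\frac{d}{dt}\int_{\Omega} \frac{\mathbf{J}(\mathbf{r}',t)}{R}\,dV' = -\frac{\mu_0}{4\pi}\,\frac{d}{dt}\int_{\Omega_i} \frac{\mathbf{J}_i(\mathbf{r}',t)}{R}\,dV' - \operatorname{grad}\varphi \qquad (3)$$

where $\varphi$ is the electric scalar potential.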

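To simplify the formulation, we consider here a two-dimensional structure with a single solid conductor. Using a frame of reference attached to the conducting body in motion, the time discretization of (2) leads to

$$\rho\,\frac{\Delta t}{2}\,(J_1 - J_0) + \frac{\mu_0}{2\pi}\int_{\Omega} (J_1 - J_0)\,\ln\frac{1}{R}\,dS' = -\rho\,J_0\,\Delta t - \frac{\mu_0}{2\pi}\left[\int_{\Omega_{i1}} J_{i1}\,\ln\frac{1}{R}\,dS' - \int_{\Omega_{i0}} J_{i0}\,\ln\frac{1}{R}\,dS'\right] \qquad (4)$$

where the subscript “0” indicates the time t and the subscript “1” the time $t + \Delta t$. Dividing (4) by $\Delta t$ yields

$$\rho\,\frac{\Delta t}{2}\left(\frac{\partial J}{\partial t}\right)_{1/2} + \frac{\mu_0}{2\pi}\int_{\Omega} \left(\frac{\partial J}{\partial t}\right)_{1/2} \ln\frac{1}{R}\,dS' = -\rho\,J_0 - \frac{\mu_0}{2\pi}\int_{\Omega_{i,1/2}} \left(\frac{\partial J_i}{\partial t}\right)_{1/2} \ln\frac{1}{R}\,dS' - \frac{\mu_0}{2\pi}\oint_{\partial\Omega_{i,1/2}} (\mathbf{n}\cdot\mathbf{v})\,J_{i,1/2}\,\ln\frac{1}{R}\,dl' \qquad (5)$$

where the subscript “1/2” refers to the time $t + \Delta t/2$, $\partial\Omega_i$ is the boundary of $\Omega_i$, v is the velocity of $\Omega_i$ in the frame of reference attached to $\Omega$, and n is the outward unit vector normal to $\partial\Omega_i$. The last term in (5) is due to the relative motion of $\Omega$ and $\Omega_i$. Solution of (5) gives the current distribution $J_1$ at the time step $t + \Delta t$ in terms of that at the time step t in the form

$$J_1 = J_0 + \Delta t\left(\frac{\partial J}{\partial t}\right)_{1/2}. \qquad (6)$$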

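The magnetic force is evaluated by applying Ampère’s force formula, i.e.,

$$\mathbf{F} = -\frac{\mu_0}{2\pi}\int_{\Omega}\int_{\Omega_i} J(\mathbf{r},t)\,J_i(\mathbf{r}_i,t)\,\frac{\mathbf{r} - \mathbf{r}_i}{|\mathbf{r} - \mathbf{r}_i|^2}\,dS_i\,dS \qquad (7)$$

with r and $\mathbf{r}_i$ being the position vectors of the points of $\Omega$ and $\Omega_i$, respectively.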

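The spatial discretization grid in the two-dimensional case is constructed by dividing the region $\Omega$ into polygonal surface elements $\omega_m$, with the induced current density considered to be constant through each $\omega_m$. The region $\Omega_i$ is divided into surface elements $\omega_{i_k}$, with the imposed current density being constant through each $\omega_{i_k}$. Integrating (5) over each $\omega_m$ yields the following matrix equation for the vector $(\partial\mathbf{J}/\partial t)_{1/2}$:

$$\left(\frac{\Delta t}{2}\,\mathbf{A} + \mathbf{B}\right)\left(\frac{\partial \mathbf{J}}{\partial t}\right)_{1/2} = -\mathbf{A}\,\mathbf{J}_0 - \mathbf{B}_i\left(\frac{\partial \mathbf{J}_i}{\partial t}\right)_{1/2} - \mathbf{C}\,\mathbf{J}_{i,1/2} \qquad (8)$$

where A is a diagonal matrix with entries $A_{m,m} = \rho_m S_m$, $\rho_m$ being the resistivity of the material for $\omega_m$ and $S_m$ its area, and B is a symmetric matrix with its entries corresponding to the elements $\omega_m$ of $\Omega$ having the form

$$B_{m,k} = \frac{\mu_0}{2\pi}\int_{\omega_m}\int_{\omega_k} \ln\frac{1}{R}\,dS_k\,dS_m = \frac{\mu_0}{2\pi}\left[S_m S_k - \frac{1}{4}\oint_{\partial\omega_m}\oint_{\partial\omega_k} (\mathbf{n}_m\cdot\mathbf{n}_k)\,R^2\,\ln\frac{1}{R}\,dl_k\,dl_m\right]. \qquad (9)$$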

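The entries of the matrix $\mathbf{B}_i$ are defined as in (9), but with the elements $\omega_k$ of $\Omega$ being replaced with the elements $\omega_{i_k}$ belonging to $\Omega_i$, while the entries of the matrix C are

$$C_{m,i_k} = \frac{1}{2}\,\frac{\mu_0}{2\pi}\oint_{\partial\omega_m}\oint_{\partial\omega_{i_k}} (\mathbf{n}_m\cdot\mathbf{R})\,(\mathbf{n}_{i_k}\cdot\mathbf{v})\,\ln\frac{1}{R}\,dl_{i_k}\,dl_m, \qquad (10)$$

with $\mathbf{R} = \mathbf{r} - \mathbf{r}'$. All integrals in (9) and (10) are evaluated by analytic expressions, the entries of the matrix B being calculated only once, but those of the matrices $\mathbf{B}_i$ and C are to be calculated for each new position of $\Omega_i$.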
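One time step of the scheme defined by (8) and (6) can be sketched as follows. This Python fragment is only an illustration under stated assumptions: the matrices are taken as given (small dense lists), the sign convention is the one that follows from (5), i.e. (Δt/2 A + B)(∂J/∂t)_{1/2} = −A J0 − B_i (∂J_i/∂t)_{1/2} − C J_{i,1/2}, and a plain Gaussian elimination stands in for whatever linear solver is actually used.

```python
def solve(M, b):
    """Solve M x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(M)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for r in range(c + 1, n):
            f = aug[r][c] / aug[c][c]
            for j in range(c, n + 1):
                aug[r][j] -= f * aug[c][j]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(aug[r][j] * x[j] for j in range(r + 1, n))
        x[r] = (aug[r][n] - s) / aug[r][r]
    return x


def matvec(M, v):
    """Dense matrix-vector product."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def time_step(A, B, Bi, C, J0, dJi_half, Ji_half, dt):
    """Advance the induced current density from t to t + dt using (8) and (6)."""
    n = len(J0)
    lhs = [[0.5 * dt * A[r][c] + B[r][c] for c in range(n)] for r in range(n)]
    a_j0 = matvec(A, J0)
    bi_dj = matvec(Bi, dJi_half)
    c_ji = matvec(C, Ji_half)
    rhs = [-(a_j0[r] + bi_dj[r] + c_ji[r]) for r in range(n)]
    dJ_half = solve(lhs, rhs)                              # (dJ/dt) at t + dt/2
    return [J0[r] + dt * dJ_half[r] for r in range(n)]     # eq. (6)


# Source-free decay of a single element (made-up numbers): A = 2, B = 1, dt = 0.1
J1 = time_step([[2.0]], [[1.0]], [[0.0]], [[0.0]], [1.0], [0.0], [0.0], 0.1)
```

For the source-free single-element case the step reduces to the familiar trapezoidal update J1 = J0 (B − A Δt/2)/(B + A Δt/2), which offers a quick sanity check of the signs.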

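Taking into account the small dimensions of the elements $\omega_m$, a rapid numerical computation of the force in (7) is performed using the approximation

$$\mathbf{F} \approx \sum_m J_m S_m \sum_k \mathbf{P}_{i_k} \qquad (11)$$

where the vector $\mathbf{P}_{i_k}$ is expressed in the form

$$\mathbf{P}_{i_k} = -\frac{\mu_0}{2\pi}\,J_{i_k}\oint_{\partial\omega_{i_k}} \mathbf{n}_{i_k}\,\ln\frac{1}{|\mathbf{r}_m - \mathbf{r}_{i_k}|}\,dl_{i_k} \qquad (12)$$

with $\mathbf{r}_m$ being the position vector of the center of the element $\omega_m$ of $\Omega$ and $\mathbf{r}_{i_k}$ the position vector of the point of integration on $\partial\omega_{i_k}$. When the ratio of the linear dimensions of $\omega_{i_k}$ to the distance between its center and the center of $\omega_m$ is sufficiently small, $\mathbf{P}_{i_k}$ can be calculated by subdividing $\omega_{i_k}$ in a number of elements $\omega_{p_k}$ in terms of this ratio and by using the summation

$$\mathbf{P}_{i_k} \approx -\frac{\mu_0}{2\pi}\,J_{i_k}\sum_p \frac{\mathbf{r}_m - \mathbf{r}_{p_k}}{|\mathbf{r}_m - \mathbf{r}_{p_k}|^2}\,S_{p_k} \qquad (13)$$

where $\mathbf{r}_{p_k}$ is the position vector of the center of $\omega_{p_k}$ and $S_{p_k}$ the area of the element $\omega_{p_k}$. The same technique is applied for a rapid numerical calculation of the entries in the matrices $\mathbf{B}_i$ and C in (8), making also use of the relation

$$\oint_{\partial\omega_{i_k}} (\mathbf{n}_{i_k}\cdot\mathbf{v})\,\ln\frac{1}{R}\,dl' = \mathbf{v}\cdot\int_{\omega_{i_k}} \frac{\mathbf{R}}{R^2}\,dS'. \qquad (14)$$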
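The summation (13) is simple to implement. The Python sketch below is an illustration under assumptions that are ours: the source element is taken to be a rectangle split into n_sub × n_sub equal cells, and the prefactor −μ0/(2π) (attraction between parallel currents) is the sign convention adopted for this example.

```python
import math

MU0 = 4e-7 * math.pi  # permeability of free space (H/m)


def p_vector(r_m, center, width, height, j_ik, n_sub):
    """Approximate the vector P of eq. (13) for a rectangular source element.

    r_m    : (x, y) of the center of the conductor element
    center : (x, y) of the center of the source element
    j_ik   : imposed current density in the source element (A/m^2)
    n_sub  : number of subdivisions per side (hypothetical choice)
    """
    dx, dy = width / n_sub, height / n_sub
    s_p = dx * dy                               # area of one sub-element
    px = py = 0.0
    for a in range(n_sub):
        for b in range(n_sub):
            xp = center[0] - width / 2 + (a + 0.5) * dx
            yp = center[1] - height / 2 + (b + 0.5) * dy
            rx, ry = r_m[0] - xp, r_m[1] - yp
            d2 = rx * rx + ry * ry
            px += rx / d2 * s_p
            py += ry / d2 * s_p
    c = -MU0 / (2.0 * math.pi) * j_ik           # assumed sign convention
    return (c * px, c * py)


# A 10 mm x 10 mm source carrying 1e6 A/m^2, observed from 1 m above its center.
P = p_vector((0.0, 1.0), (0.0, 0.0), 0.01, 0.01, 1.0e6, 4)
```

Far from the source the result approaches the filament value −(μ0/2π) J S / d, here about −2·10⁻⁵ in the y direction, which gives a convenient check of the implementation.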

In the case the imposed currents are periodic, the initial 

distribution of the induced current can be obtained by 

performing a Fourier expansion and by employing the 

phasor form of the eddy-current integral equation (see 

Section IV). 

III. SOLUTION OF EQUATION OF MOTION 

Equation (1) is solved iteratively. We choose an appropriate time step $\Delta t$ and assume that the magnetic force F has a linear variation during $\Delta t$. At the time t the body has a position defined by the vector $\mathbf{r}_0$ and a magnetic force $\mathbf{F}_0$ is exerted upon it. The iterative process is started by imposing the value $\mathbf{F}_1 = \mathbf{F}_0$ at the time $t + \Delta t$ and the position vector $\mathbf{r}_1$ results from solving (1). The electromagnetic field problem is then solved for the new $\mathbf{r}_1$ and a new value of the force $\mathbf{F}_1$ is determined for the time $t + \Delta t$. This operation is repeated until the difference between two successive values of the magnetic force for the time $t + \Delta t$ is sufficiently small and, then, we proceed to the next time step.
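The iteration described in this section can be sketched for one degree of freedom as follows. The Python fragment is a minimal illustration, not the implementation used for the results: `force` is a hypothetical callback standing in for the electromagnetic field solution, the stopping tolerance is an arbitrary choice, and the update formulas are obtained by integrating (1) in closed form for a force that varies linearly between F0 and F1 over the step.

```python
import math


def advance(r0, v0, F0, force, m, G, dt, tol=1e-9, max_iter=50):
    """One time step of m r'' = F + G with F assumed linear in time over dt."""
    F1 = F0                                   # start the iteration with F1 = F0
    r1, v1 = r0, v0
    for _ in range(max_iter):
        # closed-form integration of the linearly varying force
        v1 = v0 + dt * (0.5 * (F0 + F1) + G) / m
        r1 = r0 + v0 * dt + dt * dt * ((2.0 * F0 + F1) / 6.0 + 0.5 * G) / m
        F_new = force(r1, v1)                 # placeholder for the field problem
        if abs(F_new - F1) <= tol * max(1.0, abs(F_new)):
            return r1, v1, F_new
        F1 = F_new
    return r1, v1, F1


# Toy check with a spring-like force F = -k r (made up, not a magnetic force).
k = 1.0
r1, v1, F1 = advance(1.0, 0.0, -k * 1.0, lambda r, v: -k * r, 1.0, 0.0, 0.01)
```

For the toy oscillator the exact position after the step is cos(0.01), which the scheme reproduces to well within 10⁻⁸.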

IV. FREQUENCY-DOMAIN SOLUTION OF THE 

EDDY-CURRENT INTEGRAL EQUATION 

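Since the region $\Omega_i$ is moving with the velocity v in the frame of reference attached to $\Omega$, (2) is written in the form

$$\rho\,J(\mathbf{r},t) + \frac{\mu_0}{2\pi}\int_{\Omega} \frac{\partial J(\mathbf{r}',t)}{\partial t}\,\ln\frac{1}{R}\,dS' = -\frac{\mu_0}{2\pi}\int_{\Omega_i} \frac{\partial J_i(\mathbf{r}',t)}{\partial t}\,\ln\frac{1}{R}\,dS' - \frac{\mu_0}{2\pi}\oint_{\partial\Omega_i} (\mathbf{n}\cdot\mathbf{v})\,J_i(\mathbf{r}',t)\,\ln\frac{1}{R}\,dl'. \qquad (15)$$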

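If the imposed currents are sinusoidal, the phasor representation of (15) is

$$\rho\,\underline{J}(\mathbf{r}) + j\omega\,\frac{\mu_0}{2\pi}\int_{\Omega} \underline{J}(\mathbf{r}')\,\ln\frac{1}{R}\,dS' = -j\omega\,\frac{\mu_0}{2\pi}\int_{\Omega_i} \underline{J}_i(\mathbf{r}')\,\ln\frac{1}{R}\,dS' - \frac{\mu_0}{2\pi}\oint_{\partial\Omega_i} (\mathbf{n}\cdot\mathbf{v})\,\underline{J}_i(\mathbf{r}')\,\ln\frac{1}{R}\,dl' \qquad (16)$$

where $\omega = 2\pi f$, f being the frequency, and $\underline{J} = J^{re} + jJ^{im}$ is the phasor form of the current density, with $j = \sqrt{-1}$. The two terms on the right side of (16) show the contribution to the induced current density due to the time variation of the imposed currents and that due to the relative motion. The same technique as in the case of the time-domain analysis is used for the space discretization of (16). One obtains the following algebraic system with complex coefficients:

$$\mathbf{A}\mathbf{J}^{re} - \omega\mathbf{B}\mathbf{J}^{im} = \omega\mathbf{B}_i\mathbf{J}_i^{im} - \mathbf{C}\mathbf{J}_i^{re}$$
$$\mathbf{A}\mathbf{J}^{im} + \omega\mathbf{B}\mathbf{J}^{re} = -\omega\mathbf{B}_i\mathbf{J}_i^{re} - \mathbf{C}\mathbf{J}_i^{im}. \qquad (17)$$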
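The real-valued system (17) is the real and imaginary part of a single complex equation, (A + jωB) J = −(jωB_i + C) J_i. The Python sketch below checks this equivalence for a single element, so that all matrices reduce to scalars; the numerical values are made up and the signs follow from the phasor form of the integral equation as assumed here.

```python
import math


def solve_real_form(A, B, Bi, C, omega, Ji):
    """Solve the scalar version of (17) as a 2 x 2 real system."""
    rhs_re = omega * Bi * Ji.imag - C * Ji.real
    rhs_im = -omega * Bi * Ji.real - C * Ji.imag
    # [ A      -omega*B ] [J_re]   [rhs_re]
    # [ omega*B   A     ] [J_im] = [rhs_im]
    det = A * A + (omega * B) ** 2
    J_re = (A * rhs_re + omega * B * rhs_im) / det
    J_im = (A * rhs_im - omega * B * rhs_re) / det
    return complex(J_re, J_im)


def solve_complex_form(A, B, Bi, C, omega, Ji):
    """Solve (A + j*omega*B) J = -(j*omega*Bi + C) Ji directly."""
    return -(1j * omega * Bi + C) * Ji / (A + 1j * omega * B)


# Hypothetical scalar coefficients, for illustration only.
params = dict(A=2.0, B=0.3, Bi=0.1, C=0.05, omega=2.0 * math.pi * 50.0, Ji=3.0 + 4.0j)
J_real = solve_real_form(**params)
J_cplx = solve_complex_form(**params)
```

Both routes give the same phasor up to rounding; in practice the complex form is more compact, while the real form is convenient when only real arithmetic is available.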

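The average magnetic force over a period is evaluated using the relation

$$\mathbf{F}_{av} = -\frac{\mu_0}{2\pi}\,\mathrm{Re}\left\{\int_{\Omega}\int_{\Omega_i} \underline{J}(\mathbf{r})\,\underline{J}_i^{*}(\mathbf{r}_i)\,\frac{\mathbf{r} - \mathbf{r}_i}{|\mathbf{r} - \mathbf{r}_i|^2}\,dS_i\,dS\right\} \qquad (18)$$

where the asterisk indicates the complex conjugate.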

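For multiple-conductor systems, one uses local frames of reference attached to each of the conductors. For the conductor q, occupying the region $\Omega_q$, q = 1, 2, ..., (16) is written in the form

$$\rho\,\underline{J}(\mathbf{r}) + j\omega\,\frac{\mu_0}{2\pi}\int_{\Omega_q} \underline{J}(\mathbf{r}')\,\ln\frac{1}{R}\,dS' = -j\omega\,\frac{\mu_0}{2\pi}\sum_{p\neq q}\int_{\Omega_p} \underline{J}(\mathbf{r}')\,\ln\frac{1}{R}\,dS' - \frac{\mu_0}{2\pi}\sum_{p\neq q}\oint_{\partial\Omega_p} (\mathbf{n}\cdot\mathbf{v}_p^{(q)})\,\underline{J}(\mathbf{r}')\,\ln\frac{1}{R}\,dl' - j\omega\,\frac{\mu_0}{2\pi}\int_{\Omega_i} \underline{J}_i(\mathbf{r}')\,\ln\frac{1}{R}\,dS' - \frac{\mu_0}{2\pi}\oint_{\partial\Omega_i} (\mathbf{n}\cdot\mathbf{v}_i^{(q)})\,\underline{J}_i(\mathbf{r}')\,\ln\frac{1}{R}\,dl' \qquad (19)$$

where $\mathbf{v}_p^{(q)}$ and $\mathbf{v}_i^{(q)}$ are, respectively, the velocities of the conductor p and of $\Omega_i$ with respect to the conductor q. Equation (1) is always solved separately for each body.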

V. SOLUTION ACCELERATION FOR THE EQUATION OF 

MOTION 

The computation of the motion of conducting bodies can be accelerated considerably by using the average value of the force over a period, evaluated using the phasor representation of current density. Using the algorithm described in Section III, the time step is chosen to be a multiple of the period and is adjusted according to the force value, such that when the force decreases the time step is increased and when the force increases it is reduced.

Figure 1: Discretization of the levitated plate.

Figure 2: Evolution in time of the coordinate y of the plate for f = 2,000 Hz (y in m, from 0 to 0.06; t in s, from 0 to 13).

Figure 3: Detail regarding the motion at the beginning for f = 2,000 Hz (y in m, from 0 to 0.06; t in s, from 0 to 0.5).
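The step-size adjustment of Section V is described only qualitatively, so the following Python sketch should be read as one possible realisation rather than the rule actually used. The doubling/halving factors and the comparison with the force of the previous step are assumptions; the bounds of 1 and 50 periods are taken from the range reported in the illustrative example.

```python
def next_step_periods(n_prev, f_prev, f_now, n_min=1, n_max=50):
    """Choose the next time step as an integer number of periods.

    The step is shortened when the force magnitude grows and lengthened when it
    shrinks (hypothetical halving/doubling rule), then clamped to [n_min, n_max].
    """
    if abs(f_now) > abs(f_prev):
        n = n_prev // 2
    elif abs(f_now) < abs(f_prev):
        n = n_prev * 2
    else:
        n = n_prev
    return max(n_min, min(n_max, n))


def step_length(n_periods, frequency_hz):
    """Time step in seconds for a given number of periods."""
    return n_periods / frequency_hz


n_next = next_step_periods(8, f_prev=1.0, f_now=0.5)   # force decreased
dt_next = step_length(n_next, 2000.0)
```

Which force enters the rule (for instance the average magnetic force or the net force including gravity) is left open here; the sketch simply compares magnitudes of two successive values.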

VI. ILLUSTRATIVE EXAMPLE 

A copper plate of width 80 mm, thickness 4 mm (see Fig. 1), resistivity $2\cdot10^{-8}\ \Omega\,\mathrm{m}$ and of mass density $8.9\cdot10^{3}\ \mathrm{kg/m^3}$ is levitated using two coils of 200 turns each, of 10 mm × 10 mm in cross section and a distance between the axes of their sides of 70 mm and 30 mm, respectively. The current direction is the same in the outer and inner coils, the current intensity in each turn being $i = I\sqrt{2}\,\sin 2\pi f t$, with I = 10 A and f = 2,000 Hz.


Figure 4: Evolution in time of the coordinate y of the plate for f = 2,000 Hz, with a direct current of 10 A added in the outer coil (y in m, from 0 to 0.05; t in s, from 0 to 7).

Figure 5: Evolution in time of the coordinate y of the plate for f = 200 Hz (y in m, from 0 to 0.045; t in s, from 0 to 13).

Initially, the conducting plate is located at 10 mm 

above the coils. It is assumed that the plate only moves in 

the vertical direction, but the procedures described in the 

paper are also applicable when more degrees of freedom 

are considered. The plate cross section is discretized in 

180 rectangular elements, as indicated in Fig. 1. 

For the time-domain method presented in this paper, 

the period was divided in 48 intervals and the motion of 

the plate was observed during 26,000 periods, i.e. for 

1,248,000 time steps. The result is presented in Fig. 2, 

with the detailed motion at the beginning shown in Fig. 3. 

The computation took about 6 hours employing a 2.128 

GHz Intel processor notebook. A great reduction in the 

amount of computation is obtained by approximating the 

conducting region with a thin strip of thickness equal to 

the field depth of penetration [3]. The oscillations of the 

plate can be attenuated by adding a dc component in the 

current coils or by using a permanent magnet. If a direct 

current of 10 A is added in the outer coil, the plate 

motion becomes as it is shown in Fig. 4. 

For a frequency of 200 Hz, the motion of the plate is 

shown in Fig. 5, the attenuation of the mechanical 

oscillations being much stronger than for a frequency of 

2,000 Hz. 

It should be remarked that the same results as in Figs. 2 and 3 were obtained by using the proposed frequency-domain procedure (see Sections IV and V). Only 4,393 variable time steps, of magnitude between one period and 50 periods, were necessary for determining the motion of the plate between t = 0 and t = 13 s. The required computation time was only 124 s, i.e., about 170 times less than for the time-domain solution.
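The figures quoted in this section can be cross-checked with a few lines of arithmetic. The Python snippet below only uses the numbers stated in the text (48 intervals per period, 26,000 periods, f = 2,000 Hz, about 6 hours versus 124 s, 4,393 variable steps); the variable names are ours.

```python
f = 2000.0                  # frequency (Hz)
periods = 26000             # observed periods in the time-domain run
steps_per_period = 48

total_steps = periods * steps_per_period          # time-domain steps
simulated_time = periods / f                      # seconds of simulated motion
speed_up = 6 * 3600 / 124                         # about 6 h versus 124 s
mean_periods_per_step = simulated_time * f / 4393 # frequency-domain run

print(total_steps, simulated_time, round(speed_up), round(mean_periods_per_step, 1))
```

The time-domain run thus covers 1,248,000 steps and 13 s of motion, the ratio of computation times is roughly 174 (quoted as about 170), and the frequency-domain procedure advances by about 6 periods per step on average.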

VII. CONCLUSION AND REMARKS 

Two efficient methods are presented for computing the 

motion of the solid conductors under the action of 

electromagnetic forces. Practically, for the same accuracy 

of the results, a tremendous reduction in the amount of 

computation is achieved when using a frequency-domain 

procedure. 

The proposed methods can be extended to nonlinear 

media. For the time-domain procedure one can utilize the 

polarization method [4], which allows the formulation of 

the eddy-current integral equation [2]. For the 

frequency-domain procedure, one can adopt the method 

proposed in [5], [6]. Since in some problems, for instance 

in electromagnetic levitation problems, the air regions are 

relatively large with respect to the conducting or/and 

ferromagnetic regions, the weight of the fundamental 

harmonic in the harmonic spectrum is significant and, 

thus, for the convergence acceleration in the polarization 

method one can efficiently employ the technique 

proposed in [7]. 

The methods presented can be applied to 

three-dimensional structures as well, by adopting the 

eddy-current integral equation proposed in [8] and 

extended in [2] to nonlinear media and to moving bodies. 

In this case, the spatial discretization of (16) is done by decomposing the induced current density using functions of the form $\nabla\times\mathbf{W}$, where W are edge elements. When edge elements of the first order are used, the current density is constant inside the tetrahedral volume elements and the relations presented in this paper remain valid.

Now, the integrals similar to those in (9) and (10) can 

only partially be evaluated analytically. 

Finally, it should be remarked that the procedures in 


[2], [5], [6] and [7] allow the extension to nonlinear 

media of the proposed frequency-domain technique 

which, as illustrated in this paper, could yield a 

spectacular reduction in the amount of computation 

needed. 

ACKNOWLEDGMENT


This work was supported in part by the Romanian 

Ministry of Labour, Family and Social Protection through 

the Financial Agreement POSDRU/89/1.5/S/62557 and 

by a grant of the Romanian National Authority of 

Scientific Research, CNDI-UEFISCDI, project number 

PN-II-PT-PCCA-2011-3.2-0373. 

REFERENCES 

[1] S. Kurz, J. Fetzer, G. Lehner, and W. M. Rucker, “A novel 

formulation for 3D eddy current problems with moving bodies 

using a Lagrangian description and BEM-FEM coupling,” IEEE 

Trans. Magn., vol. 34, no. 5, pp. 3068-3073, Sep. 1998. 

[2] R. Albanese, F. Hantila, G. Preda, and G. Rubinacci, “Integral 

formulation for 3-D eddy current computation in ferromagnetic 

moving bodies,” Rev. Roum. Sci. Techn., Electrotechn. et Energ., 

vol. 41, no. 4, pp. 421-429, 1996. 

[3] I. R. Ciric, F. I. Hantila, and M. Maricaru, “Field analysis for thin 

shields in the presence of ferromagnetic bodies,” IEEE Trans. 

Magn., vol. 46, no. 8, pp. 3373-3376, Aug. 2010. 

[4] F. Hantila, “A method of solving magnetic field in nonlinear 

media,” Rev. Roum. Sci. Techn., Electrotechn. et Energ., vol. 20, 

no. 3, pp. 397-407, 1975. 

[5] I. R. Ciric and F. I. Hantila, “An efficient harmonic method for 

solving nonlinear time-periodic eddy-current problems,” IEEE 

Trans. Magn., vol. 43, no. 4, pp. 1185-1188, Apr. 2007. 

[6] I. R. Ciric, F. I. Hantila, M. Maricaru, and S. Marinescu, “Efficient 

analysis of the solidification of moving ferromagnetic bodies with 

eddy-current control,” IEEE Trans. Magn., vol. 45, no. 3, pp. 

1238-1241, Mar. 2009. 

[7] I. R. Ciric, F. I. Hantila, and M. Maricaru, “Convergence 

acceleration in the polarization method for nonlinear periodic 

fields,” COMPEL, vol. 30, no. 6, pp. 1688-1700, 2011. 

[8] R. Albanese and G. Rubinacci, “Integral formulation for 3D 

eddy-current computation using edge elements,” IEE Proceedings 

A , vol.135, no.7, pp.457-462, Sep. 1988.


Adaptive Inductance Computation on GPUs 

A.G. Chiariello, A. Formisano and R. Martone 

Dipartimento di Ingegneria Industriale e dell’Informazione 

Seconda Università di Napoli, Via Roma 29, Aversa (CE), Italy 

E-mail: Alessandro.Formisano@Unina2.it 

Abstract—Inductances computation involving highly complex geometries and linear materials can be tackled by discretizing 

coils into simpler elements, whose magnetic behavior is analytically expressible but, to achieve reliable results, very high 

numbers of elements may be required. In such cases, advantages can be taken from GPU capabilities of dealing efficiently 

with simple computational tasks. In the paper, a code able to compute self and mutual inductances of any 3D coils, taking 

advantage of GPU capabilities, is presented. 

Index Terms— GPU, High Performance Computing, Inductance Computation 


I. INTRODUCTION

The computation of self and mutual inductances for

complex 3D shaped coils is a demanding task, since no 

general formulas exist. Numerical computations, in order 

to achieve reliable results, require large discretization 

efforts and, in general, multiple runs. As a consequence, 

in a number of practical cases the computer burden could 

be very high. 

Accuracy for generally shaped coils and computational 

promptness represent conflicting objectives, especially in 

optimal design [1, 2], and various computational 

paradigms have been proposed to achieve reasonable 

trade-offs. One of the most promising approaches is the 

adoption of High-Performance Computing (HPC) 

architectures; among HPC approaches, an effective 

solution is the use of Graphic Processing Units (GPU), 

easily available even on desktop class hardware. 

On the other hand, to exploit these particular architectures at their best, simulation codes often need to be revised, and new solutions, well suited for CPU-based computational environments, must be adopted.

Assuming that no magnetic materials are present (i.e. the relative permeability $\mu_{rel}$ is equal to 1 everywhere), and following well established approximation formulas [3], self and mutual inductances can be computed using “segmented” approximations. The basic idea is to decompose coils into simpler elements, for which self or mutual inductances can be easily computed, possibly using closed form expressions. Superposition is then used to get the final value of the self or mutual inductance of the coils.

In this paper a method able to compute self and mutual 

inductances in air for generally 3D shaped massive coils, 

based on coil decomposition into filamentary elements, 

called current sticks, is presented, and its implementation 

on HPC environments, based on GPUs, is briefly 

discussed. The method is able to adapt the discretization 

level to the required accuracy, and was implemented in 

the INDIANA code (INDuctance Iterative and Adaptive 

Numerical Assessment). 

The basic objective of INDIANA is the computation of 

self and mutual inductances of coils wound with multiple 

series-connected turns of conductors. In mutual 

inductance computation, INDIANA decomposes both 

coils into a suited number of current sticks, and computes 

the line integral along each stick of the “target” coil of the 

vector potential A generated by each stick of the “source” 

coil, with unit current. Then, the procedure, taking 

advantage of the concept of “partial inductance” [4] and 

of the linearity assumption, sums all contributions to get 

the final, overall required value. Self inductances are 

computed by using the same coil for both source and 

target, but extracting the singular case of “self” 

computations for each element, which are treated using 

the expression for self inductance of a current stick [3]. 

This scheme is well suited to GPU-based computations, as will be further discussed in Sect. III.

In the following, a short overview of the relevant points of the method is presented (Sect. II), some comments on the GPU implementation are reported (Sect. III), and a few examples are discussed to help assess the capabilities of the method (Sect. IV). Finally, two appendices give a brief description of GPU architecture and programming paradigms.

II. MATHEMATICAL FORMULATION 

The decomposition of massive coils into elementary 

components is performed in two steps, taking into 

account the structure of the winding, the distance between 

the coils, and the local curvature of each coil. 

As a first step, each coil is decomposed into as many filamentary conductors as required by accuracy needs. Typically, one filament is used for each actual conductor in the Winding Pack (WP) of the coil. However, this number can be increased according to the adopted technology (e.g. in superconducting cable-in-conduit conductor technology, the decomposition can be extended down to the petal level) to meet the accuracy needs.

As a second step, each conductor is described using an 

interpolating line, typically a spline, defined by a limited 

number of parameters, such as the coordinates of a 

suitable number of control points, constraining the shape 

of the conductor to the required geometrical accuracy. In 

addition the continuous curve is reduced to a collection of 

sticks, defined by a number of suitable break points (see 

Fig. 1 for a schematic view). 

The number of sticks is selected on the basis of the 

accuracy required by magnetic field computation, and can 

vary depending on the local curvature of the interpolating 

curve and on the distance from field points. INDIANA 

performs a first guess for the distribution of break points 

along conductors of the source coil on the basis of 

average curvature radius and on minimum distance between the midpoint of the section of the source coil being considered and the target coil.

Figure 1: Segmentation of coils centerline into “sticks”: source coil (left) and target coils (right). [Labels: actual coil; current sticks approximation; first coil: source; second coil: target.]

If the required accuracy is not fulfilled, the number of break points is increased and a new inductance computation is performed. The process iterates until convergence, in the Cauchy sense within a prescribed accuracy, is achieved.

The mutual inductance $M_{jk}$ between the k-th stick on the source coil and the j-th stick on the target coil can be computed either using the classical formulas from [3], or by (numerically) line integrating the vector potential $\mathbf{A}_k(\mathbf{x})$ generated by stick k along stick j:

$$M_{jk} = \int_{\Gamma_j} \mathbf{A}_k(\mathbf{x}_j) \cdot \hat{\mathbf{t}} \, dl_j \tag{1}$$

where $\Gamma_j$ is the straight line along the j-th stick, $\mathbf{x}_j$ is a generic point along $\Gamma_j$, and $\hat{\mathbf{t}}$ is the stick unit vector. The expression for $\mathbf{A}_k$ is given in [5]:

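$$\mathbf{A}_k = \frac{\mu_0}{4\pi} \, \hat{\mathbf{a}} \, \ln \frac{c + b + a}{c + b - a} \tag{2}$$

where $\mathbf{c} = \boldsymbol{\xi}_{j+1} - \mathbf{x}_i$, $\mathbf{b} = \boldsymbol{\xi}_j - \mathbf{x}_i$ and $\mathbf{a} = \boldsymbol{\xi}_{j+1} - \boldsymbol{\xi}_j$, with $\boldsymbol{\xi}_j$ and $\boldsymbol{\xi}_{j+1}$ the coordinates of the stick tips, $\mathbf{x}_i$ the field point, $a$, $b$, $c$ the moduli of the corresponding vectors and $\hat{\mathbf{a}} = \mathbf{a}/a$.

Figure 2: Basic elements for the computation of the vector potential using (2): field point $\mathbf{x}_j$, vectors $\mathbf{a}$, $\mathbf{b}$, $\mathbf{c}$ and resulting potential $\mathbf{A}_k$.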
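As an illustration of (2), the following minimal Python sketch evaluates the vector potential of a single straight stick carrying unit current. The function name `stick_vector_potential` and the tuple-based point representation are assumptions made for this example and are not taken from the INDIANA code. The singular case of field points lying on the stick axis, which INDIANA treats following [6], is not handled here.

```python
import math

MU0 = 4.0e-7 * math.pi  # vacuum permeability [H/m]


def stick_vector_potential(tip_start, tip_end, x, current=1.0):
    """Vector potential [Wb/m] at field point x of a straight stick, as in (2).

    tip_start, tip_end and x are 3-tuples in meters. The result is directed
    along the stick; points lying on the stick itself are singular.
    """
    a_vec = [e - s for s, e in zip(tip_start, tip_end)]
    a = math.sqrt(sum(v * v for v in a_vec))   # stick length
    b = math.dist(tip_start, x)                # distance from the start tip
    c = math.dist(tip_end, x)                  # distance from the end tip
    log_term = math.log((c + b + a) / (c + b - a))
    scale = MU0 * current / (4.0 * math.pi) * log_term / a
    return [scale * v for v in a_vec]
```

Because the potential is parallel to the source stick, only the component of the target stick direction along the source stick contributes to the line integral (1).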

INDIANA implements a slightly modified version of 

(2), to treat the singularity when computing A on points 

on the source stick axis [6]. The number of Gauss points 

for numerical integration of (2) is chosen, according to 

the requested accuracy, on the basis of the distance 

between centers of source and target points, using a linear 

relationship based on the length of the target stick and the 

distance between mid points of source and target sticks. 
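The adaptive procedure described above can be sketched in Python for the simple case of two coaxial circular filaments. This is only an illustrative sketch under stated assumptions: the function names are hypothetical, the line integral (1) is approximated with a single integration point at the midpoint of each target stick (instead of a distance-dependent number of Gauss points), and refinement simply doubles the number of sticks until two successive values agree within a relative tolerance.

```python
import math

MU0 = 4.0e-7 * math.pi  # vacuum permeability [H/m]


def circle_sticks(radius, z, n):
    """Approximate a circular filament at height z with n straight sticks."""
    pts = [(radius * math.cos(2.0 * math.pi * i / n),
            radius * math.sin(2.0 * math.pi * i / n), z) for i in range(n)]
    return [(pts[i], pts[(i + 1) % n]) for i in range(n)]


def potential(stick, x):
    """Vector potential of one source stick with unit current, as in (2)."""
    start, end = stick
    a = math.dist(start, end)
    b = math.dist(start, x)
    c = math.dist(end, x)
    f = MU0 / (4.0 * math.pi) * math.log((c + b + a) / (c + b - a)) / a
    return tuple(f * (e - s) for s, e in zip(start, end))


def mutual_inductance(source, target):
    """Sum of the partial inductances M_jk, midpoint rule on each target stick."""
    total = 0.0
    for t_start, t_end in target:
        mid = tuple(0.5 * (u + v) for u, v in zip(t_start, t_end))
        dl = tuple(v - u for u, v in zip(t_start, t_end))
        for stick in source:
            a_k = potential(stick, mid)
            total += sum(ai * di for ai, di in zip(a_k, dl))
    return total


def adaptive_mutual(r1, z1, r2, z2, rel_tol=1e-4, n0=16, n_max=1024):
    """Double the number of sticks until successive values agree (Cauchy test)."""
    n = n0
    prev = mutual_inductance(circle_sticks(r1, z1, n), circle_sticks(r2, z2, n))
    while n < n_max:
        n *= 2
        cur = mutual_inductance(circle_sticks(r1, z1, n), circle_sticks(r2, z2, n))
        if abs(cur - prev) <= rel_tol * abs(cur):
            return cur, n
        prev = cur
    return prev, n
```

A call such as `adaptive_mutual(0.20, 0.0, 0.25, 0.1)` returns the inductance in henry together with the number of sticks per coil at which the stopping test was met.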

If source and target sticks coincide, the self inductance $M_{kk}$ of the k-th stick can be computed using the expression for a thin beam [3], which provides the value in millihenry if the stick length $L_k$ is given in meters:

$$M_{kk} = 2 \cdot 10^{-4} \, L_k \left[ \ln \frac{2 L_k}{r} - 1 + \frac{\delta}{L_k} \right] \tag{3}$$

where $r$ is the geometric mean distance and $\delta$ is the arithmetic mean distance on the corresponding k-th beam cross section. Values of $r$ and $\delta$ for different cross sections are given in [3], while INDIANA adopts the expression for circular cross section and long beams, where $r$ is the radius of the cross section and $\delta / L_k$ is negligible.
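The thin-beam expression can be written in a few lines of Python. The sketch below is only illustrative: the function name is hypothetical, and the prefactor is written as $\mu_0 / (2\pi) = 2 \cdot 10^{-7}$ H/m so that the result is in henry when lengths are in meters.

```python
import math

MU0 = 4.0e-7 * math.pi  # vacuum permeability [H/m]


def stick_self_inductance(length, r, delta=0.0):
    """Self inductance [H] of a straight stick of given length [m].

    r is the geometric mean distance and delta the arithmetic mean distance
    of the cross section [m]; delta defaults to 0 (long-beam approximation).
    """
    return MU0 / (2.0 * math.pi) * length * (
        math.log(2.0 * length / r) - 1.0 + delta / length
    )
```

With the long-beam choice adopted in the text, only the stick length and the cross-section radius are needed.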

III. GPU IMPLEMENTATION 

This section focuses on the porting of the INDIANA code to GPU hardware. In Appendix A, for the benefit of non-experts, a short introduction to GPU architecture is given [7-11], while in Appendix B some hints strictly related to the peculiarities of GPU hardware are provided for interested programmers [7, 8].

The typical computational GPU architecture includes a 

classical CPU section where the GPUs are grafted. In 

order to profitably use the parallel nature of the GPU, any 

code implementing numerical computations needs to be 

split into sequential parts, which are performed on the 

main CPU (or CPUs), and into the numerically intensive 

parts, which can be more effectively performed on the 

GPUs. In this way the best exploitation of GPUs and 

CPUs execution capabilities can be achieved. 

In the INDIANA code the basic computational task, i.e. the evaluation of (2), the magnetic vector potential A generated by a single stick at a single field point, can benefit from the GPU architecture. As a matter of fact this task, which must be repeated a very high number of times, can simply be assigned to a thread; then, a suitable grouping of threads into computational blocks can be organized to best exploit the data structure and the available resources.

In order to achieve the peak performance [9], the 

computational kernel needs to use GPU registers; in 

addition, it needs to treat a large number of independent 

instructions to exploit the scheduler capability of the 

graphic card (further details can be found in Appendix 

B). For these reasons the code was structured following 

the flowchart: 

1) Load the field point associated to the considered thread (global memory access).
2) Load the start and end points of the sticks (global memory access).
3) Compute the contribution to the field of each stick in the bundle using (2).
4) Accumulate all the contributions in a register.
   Last stick? No: continue with the next stick. Yes: go to step 5).
5) Store the computed vector field in the global memory.

Figure 3: Flowchart showing the INDIANA kernel computation on GPU architecture.
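The structure of the kernel in Fig. 3 can be mimicked serially in Python, which may help readers who are not familiar with the thread and block indexing. This is not the CUDA kernel of INDIANA: the names `kernel_thread` and `launch`, the 1D grid and the default of 64 threads per block are assumptions made for the example.

```python
import math

MU0_OVER_4PI = 1.0e-7  # mu0 / (4 pi) [H/m]


def kernel_thread(idx, field_points, sticks, out):
    """Work done by one hypothetical GPU thread, following steps 1) to 5)."""
    if idx >= len(field_points):      # guard for threads beyond the data
        return
    x = field_points[idx]             # 1) load the field point
    acc = [0.0, 0.0, 0.0]             # accumulator (a register on the GPU)
    for start, end in sticks:         # 2) load the tips of each stick
        a = math.dist(start, end)
        b = math.dist(start, x)
        c = math.dist(end, x)
        w = MU0_OVER_4PI * math.log((c + b + a) / (c + b - a)) / a
        for d in range(3):            # 3)-4) evaluate (2) and accumulate
            acc[d] += w * (end[d] - start[d])
    out[idx] = tuple(acc)             # 5) store the result


def launch(field_points, sticks, threads_per_block=64):
    """Serial stand-in for a kernel launch over a 1D grid of thread blocks."""
    n = len(field_points)
    out = [None] * n
    n_blocks = (n + threads_per_block - 1) // threads_per_block
    for block in range(n_blocks):
        for thread in range(threads_per_block):
            kernel_thread(block * threads_per_block + thread,
                          field_points, sticks, out)
    return out
```

On a GPU the two loops in `launch` disappear: every (block, thread) pair runs `kernel_thread` concurrently, and only the loop over the sticks remains inside each thread.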

The final integration step (1) is executed on the CPU side, since it is well suited to CPU characteristics, and its impact on the computational burden is quite marginal.

INDIANA was implemented using the MATLAB© Parallel Computing Toolbox, and the core of the code has been parallelized on the GPU using CUDA©, an extension of the C language for Nvidia© GPU programming. A few details are given in Appendix B.

IV. EXAMPLES OF APPLICATION 

In this section, two groups of examples are presented. 

In the first group, in order to assess accuracy, results from 

INDIANA are compared to standard results for simple 

geometries where analytical formulas are available. In the 

second group, results from INDIANA are compared to 

computations from 3D finite elements to assess speed-up 

of computations for generally shaped coils. 

Computational times all refer to an Intel i7-based PC with 8 GB of RAM, running Matlab© Ver. 7 for the CPU computations and CUDA© Ver. 4.2 for the GPU computations.

a. Accuracy Assessment 

Three filamentary single-turn coils have been 

considered in this group. The first one is a circular 

coil, while the other two are elliptical coils. Analytical 

expressions and reference figures were taken from 

[12]. Geometrical details are reported in Table I, 

while a comparison of results is reported in Table II, 

for increasing accuracy (expressed in Parts Per 

Million -p.p.m.- of the reference value), and 

consequently for increasing number of sticks in 

INDIANA calculations. 

Figure 4: Coils used for Accuracy Assessment (axes x, y, z; coils C1, C2, C3; radii a, b).

TABLE I
COILS USED FOR ACCURACY ASSESSMENT

Coil #   Centre position [m]   Radii (a,b) [m]
C1       (0.0, 0.0, 0.0)       1 (circular)
C2       (0.0, 0.0, 0.5)       (1/3, 2/3)
C3       (0.6, 2.0, 0.1)       (1/3, 2/3)

TABLE II
INDUCTANCES FOR INCREASING ACCURACY

Required   Required Accuracy   Number of   Computational   Reference
Figure     [p.p.m.]            sticks      times [s]       Value [µH]
MC1-C2     1.0                 10^3        1.13            0.30871178
MC1-C2     0.1                 10^4        36.9            0.30871178
MC1-C3     1.0                 10^3        1.17            0.03963496
MC1-C3     0.1                 10^4        35.8            0.03963496


b. Speed Assessment 

For this second analysis, the mutual inductance of two coaxial circular coils has been considered. This benchmark case is treated using both INDIANA with 300 sticks and the 3D FEM package COMSOL Multiphysics 4.2a (27524 2nd-order tetrahedral elements, neglecting any symmetry for the sake of generality). The two coils are described in Table III, while results are reported in Table IV.

TABLE III
COILS USED FOR SPEED ASSESSMENT

Coil #   Centre position [m]   Radius [m]
C1       (0.0, 0.0, 0.0)       0.20
C3       (0.0, 0.0, 0.1)       0.25

TABLE IV
INDUCTANCES FOR SPEED ASSESSMENT

Method                Mutual Inductance [µH]   Computational time [s]
INDIANA               0.2487                   4
3D FEM package        0.2486                   490
Reference Value [2]   0.2488                   ---

As a second speed test, the computation of mutual 

inductance between a massive solenoidal source coil 

(radius 1.7 m, length 2.0 m, thickness 0.7 m, 14 layers, 38 

turns per layer) and a filamentary coil (Rin=4.77 m, 

Rout=5.83 m, Zlow=5.02 m, Zup=5.11 m, =17.3°) used to 

measure flux across a test surface was considered (See 

Fig. 5). This test case is relevant for flux measurements in 

magnetic confinement fusion devices [13, 14]. The accuracy requirements on the flux measurement are rather severe, and a high number of sticks may be needed to achieve that accuracy; in these cases the GPU speed enhancement can be very useful to complete the simulation in a reasonable amount of time. All methods proved able to give the correct result of 2.5665048e-3 H.

The massive source coil has been represented by as 

many conductors as actually present in its WP (that is, 

14×38), while the discretization level along each 

conductor has been varied to improve accuracy. 

Computational times for various discretization levels, for both GPU computations and, for the sake of comparison, purely CPU computations, are reported in Table V.

Figure 5: Sketch of source coil for flux generation in Tokamak devices and Partial Flux measurement loop

TABLE V
SPEED UP FOR MUTUAL INDUCTANCE

Computational Times [s]   1000 sticks   5000 sticks
GPU (tGPU)                5.31          19.4
CPU (tCPU)                35.8          125
Speed up (tCPU/tGPU)      6.74          6.44
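The speed-up figures of Table V follow directly from the measured times; a short Python check (variable names are only illustrative) is:

```python
# Computational times [s] taken from Table V
timings = {
    1000: {"gpu": 5.31, "cpu": 35.8},
    5000: {"gpu": 19.4, "cpu": 125.0},
}


def speed_up(t_cpu, t_gpu):
    """Ratio between CPU and GPU computational times."""
    return t_cpu / t_gpu


speed_ups = {n: round(speed_up(t["cpu"], t["gpu"]), 2)
             for n, t in timings.items()}
```

The ratio changes only slightly between the two discretization levels, so in these two tests the acceleration is nearly independent of the number of sticks.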


A numerical code able to compute the mutual inductance between pairs of arbitrary massive coils has been presented. The code is called INDIANA, and is able to adaptively modify its computational parameters to achieve a trade-off between accuracy and computational speed.

The INDIANA code benefits from a significant acceleration (up to 7x) thanks to GPU parallelization. This performance makes it easy to meet high accuracy requirements in mutual inductance calculations for complex 3D shaped coils.

INDIANA performance has been assessed in terms of both accuracy and speed, with respect to simple geometries presented in the literature and to complex shapes, compared to FEM computations.

Future activity will address MPI parallelization over a computer cluster where each node is equipped with a GPU, in order to analyze more complex structures.


The authors wish to thank Mr. M. Nicolazzo from CREATE and Mr. M. Fatica from Nvidia for fruitful discussions and for valuable hints and suggestions.

This work was partly supported by Seconda Università 

di Napoli under PRIST grant “Generazione distribuita di 

energia da fonti tradizionali e rinnovabili: aspetti 

ingegneristici e giuridici-economici-ambientali”, partly 

by NVIDIA Corporation and partly by 

ENEA/EURATOM CREATE association. 

REFERENCES 

[1] M. Cioffi, A. Formisano, R. Martone, “Increasing design robustness in evolutionary optimisation”, COMPEL, vol. 23, pp. 187-196, 2004.

[2] M. Cioffi, A. Formisano, R. Martone, G. Steiner, D. Watzenig, “A fast method for statistical robust optimization”, IEEE Transactions on Magnetics, vol. 42, pp. 1099-1102, 2006.

[3] F. Grover, Inductance Calculation, New York: D. Van Nostrand, 

1946. 

[4] C. R. Paul, Introduction to Electromagnetic Compatibility, 

Hoboken (NJ): J. Wiley & Sons, 2006. 

[5] H. A. Haus, J. R. Melcher, Electromagnetic Fields and Energy, 

Englewood Cliffs, NJ: Prentice Hall, 1989. 

[6] J. Hanson, S. Hirshman, “Compact expressions for the Biot– 

Savart fields of a filamentary segment”, Phys. of Plasmas, vol. 9, 

pp.4410-4412, Oct. 2002. 

[7] D. Kirk, W. Hwu, Programming Massively Parallel Processors: A 

Hands-on Approach, Elsevier, 2010. 

[8] M. Garland et al., “Parallel computing experiences with CUDA”, 

IEEE Micro, vol. 28, 2008 pp. 13–27. 

[9] V. Volkov, “Better performance at lower occupancy”, Proceedings of NVIDIA GPU Technology Conference 2010, San Jose, USA, pp. 20-23, Sept. 2010.

[10] R. Farber, CUDA Application Design and Development, Morgan 

Kaufmann, 2011. 

[11] F. Calvano, G. Rubinacci, A. Tamburrino, G. Vasilescu, S. Ventre, “Parallel MGS-QR sparsification for fast eddy current NDT simulation”, Studies in Applied Electromagnetics and Mechanics, vol. 36, pp. 29-36, 2012.

[12] J. T. Conway, “Exact Solutions for the Mutual Inductance of 

Circular Coils and Elliptic Coils”, IEEE Trans. on Magn., vol 48, 

pp. 81-94, 2012. 

[13] A. J. Donné et al., “Progress in ITER Physics basis, Chapter 7: Diagnostics”, Nucl. Fusion, vol. 47, pp. S337-S384, 2007.

[14] A. Formisano, J. Knaster, R. Martone et al., “ITER non-axisymmetric error fields induced by its magnet system”, Fusion Engineering and Design, vol. 86, pp. 1053-1056, 2011.

APPENDIX A: GPU ARCHITECTURES 

The hardware used in computer Central Processing Units (CPUs) seems to be reaching the physical limits beyond which an increase of clock frequency or of integration scale is very hard with present technology. As a possible alternative, CPU manufacturers are moving to multiple-core CPUs, but the number of cores is usually limited to a few tens. On the other hand, realistic treatment of real world applications gives rise to computationally demanding numerical models. In order to speed up the computations, different parallelization paradigms can be considered, a few examples being reported in [11].

Recently, Graphic Processing Units (GPUs), present on virtually all computer graphic cards, have been proposed as data-parallel coprocessors to solve compute-intensive science and engineering problems, since it was observed that the mathematical processing of high resolution images is very similar to the computations usually required in numerical models of physical phenomena.

A modern GPU can have up to 1024 processor cores or 

Streaming Processors (SPs) grouped in Streaming 

Multiprocessors (SMs), each containing eight processor 

cores. 

In order to reduce the chip area dedicated to the control unit, the cores in an SM use a parallel computation paradigm called Single Instruction, Multiple Data (SIMD), where concurrent processors execute the same code (called kernel) on different data.

The tasks submitted to each core are called threads; the threads are grouped in thread blocks (Fig. 6b); the maximum dimension of a block is presently 512 threads; hence, a code needs to launch many thread blocks. For this reason the thread blocks are grouped into a grid of thread blocks (Fig. 6c). The threads in a block can be indexed using a 3D identifier, while a block in a grid can be indexed using a 2D identifier.

A block of threads is assigned to an SM; each SM can use 8,192 registers; the registers are the fastest memory inside a GPU, but they are dynamically partitioned among the threads inside the blocks; each thread can only access its own registers (Fig. 6a). All threads inside a block can cooperate by sharing data through an on-chip low latency memory called shared memory (48 kB); the shared memory bandwidth is about 6x lower than the register bandwidth.

The GPU has a memory, called global memory, where the CPU can upload the input data and from which it can download the result of the computation. The global memory is accessible to all the SMs; it is the largest memory inside a GPU, up to 6 GB in modern solutions.

APPENDIX B: GPU PROGRAMMING STRATEGY 

In order to obtain the best results, a programmer needs to take into account the peculiar hardware characteristics of GPUs in each step of the program [7, 10].

Accesses to global memory are time consuming; in order to increase the access rate, each time a location is accessed, many consecutive locations are accessed by the hardware.

Figure 6: GPU memory model: a) thread (registers), b) thread blocks (shared memory), c) grid of thread blocks (global memory)

A typical GPU program follows the steps shown in Fig. 7. In order to obtain peak performance the programmer needs to reorganize, on the host side (Fig. 7, step a), the input data in such a way that adjacent threads operate on adjacent data in global memory (coalesced access). In this case, the hardware combines, or coalesces, all of these accesses into a single access to consecutive locations. An example related to geometric data is shown in Fig. 8. In order to allow coalesced accesses, duplication of data may be needed.

a) Upload part of the input data from the CPU memory to the GPU global memory
b) Use the thread and block index to select the data from the GPU global memory
c) Compute the task and store the results in GPU global memory
d) Download the result from the GPU global memory to the CPU memory

Figure 7: Flowchart of a typical GPU code

a) X1 Y1 Z1 X2 Y2 Z2 … XN YN ZN
   Thread 1 (th1) reads X1 Y1 Z1, th2 reads X2 Y2 Z2, …, thN reads XN YN ZN
b) X1 X2 … XN Y1 Y2 … YN Z1 Z2 … ZN
   th1, th2, …, thN read the adjacent locations X1, X2, …, XN

Figure 8: a) Uncoalesced access pattern, b) Coalesced access pattern
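The host-side reorganization of geometric data from pattern a) to pattern b) of Fig. 8 amounts to converting an array of points into separate coordinate runs. A minimal Python sketch follows; the function names are hypothetical and plain Python lists stand in for the GPU global memory.

```python
def to_structure_of_arrays(points):
    """Reorder [(x1, y1, z1), (x2, y2, z2), ...] into the flat layout
    x1 ... xN, y1 ... yN, z1 ... zN (pattern b of Fig. 8)."""
    n = len(points)
    flat = [0.0] * (3 * n)
    for i, (x, y, z) in enumerate(points):
        flat[i] = x
        flat[n + i] = y
        flat[2 * n + i] = z
    return flat


def load_point(flat, n, thread_id):
    """Read back the point of one thread: for each coordinate, thread i and
    thread i + 1 touch adjacent locations."""
    return flat[thread_id], flat[n + thread_id], flat[2 * n + thread_id]
```

With this layout the X reads of neighbouring threads fall on consecutive addresses, and likewise for Y and Z, which is the condition for the hardware to combine them.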

As already discussed, a thread can access three kinds 

of memory: global memory, shared memory and the 

registers. The registers are the fastest memory; hence, in 

order to achieve the best performance, the programmer 

has to use as many registers as possible. 

Of course, the actual speed up can be limited by the amount of on-chip memory resources (such as registers and shared memory) and by the shared memory bandwidth.

The GPU has a sophisticated scheduler that is very effective in minimizing the performance loss due to global memory accesses. If a sufficient number of threads is available, the scheduler can concurrently run the threads on the multiprocessor, masking the memory accesses. This approach is known in the literature as Thread Level Parallelism (TLP) [7-9].

The larger the number of threads, the fewer the registers available to each thread (there are 8192 registers, partitioned among the threads inside a block). Unfortunately, the limited number of registers actually available per thread can be a bottleneck for the entire code.
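The trade-off between the number of threads and the registers available to each of them is simple arithmetic. The Python sketch below assumes, for illustration only, that the 8192 registers are split evenly among all threads resident on one SM; the function name and the `blocks_per_sm` parameter are hypothetical.

```python
REGISTERS_PER_SM = 8192  # figure quoted in the appendices


def registers_per_thread(threads_per_block, blocks_per_sm=1):
    """Upper bound on the registers each thread can use when the register
    file of one SM is split evenly among all resident threads."""
    return REGISTERS_PER_SM // (threads_per_block * blocks_per_sm)


budget = {t: registers_per_thread(t) for t in (64, 128, 256, 512)}
```

Under this assumption, a block of 64 threads leaves 128 registers per thread, while a block of 512 threads leaves only 16.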

In order to increase the speed up, the scheduler is able to analyze the instruction flow and to evaluate whether two instructions can safely be executed at the same time. If so, the hardware locates possible free units and executes more than one instruction during the same clock cycle, increasing the parallelism. Hence, a small number of threads per block is usually recommended (64 threads per block can be a good compromise) and, in addition, the threads have to be designed to present as many independent instructions as possible: in this way Instruction Level Parallelism (ILP) [7-9] increases the performance.


The Reduced Basis Method Applied to 

Transport Equations of a Lithium-Ion Battery 

Stefan Volkwein∗, Andrea Wesche∗†

∗Universität Konstanz, Fachbereich Mathematik und Statistik , Universitätsstraße 10, D-78457 Konstanz, 

E-mail: stefan.volkwein@uni.konstanz.de 

† Adam Opel AG, Bahnhofsplatz, D-65423 Rüsselsheim, E-mail: Andrea.Wesche@de.opel.com 

Abstract—In this paper we consider a coupled system of nonlinear parametrized partial differential equations (P 2 DEs), 

which models the concentrations and the potentials in lithium-ion batteries. The goal is to develop an efficient reduced 

basis approach for the fast and robust numerical solution of the P 2 DE system. Numerical examples illustrate the efficiency 

of the proposed approach. 

Index Terms—finite volume method, greedy algorithm, lithium-ion battery, reduced basis method 


The modelling of lithium-ion batteries has received an 

increasing amount of attention in the recent past. Several 

companies worldwide are developing such batteries 

for consumer electronic applications, in particular, for 

electric-vehicle applications. To achieve the performance 

and lifetime demands in this area, exact mathematical 

models of the battery are required. Moreover, the multiple 

evaluation of the battery model for different parameter 

settings involves a large amount of time and experimental 

effort. Here, the derivation of reliable mathematical 

models and their efficient numerical realization are very 

important issues in order to reduce both time and cost in 

the improvement of the performance of batteries. 

In the present work we consider a mathematical model 

for lithium-ion batteries which describes the transport 

processes by a partial differential equation system. This 

model is developed in the paper by Popov et al. [17]. The 

physical and chemical details can be found in [13] and 

[14]. The equation system models a physico-chemical 

micro-heterogeneous battery model. 

We discretize this system by the finite volume method and the backward Euler method. The reduced basis methodology for systems discretized by finite volumes can be found in [10]. The discretized model is reduced by the reduced basis method [16]. Our numerical tests will illustrate the efficiency of this approach.

A popular battery model is the one developed by Newman [4], [15], which was implemented and tested in [5]. Let us also refer to the work [6], where a different battery model is derived. For an equation system which describes a physico-chemical macro-homogeneous battery model, well-posedness is shown by Wu et al. [19].

II. BATTERY MODEL 

PDE Model 

Let Ω ⊂ R be an open interval, which is divided into three disjoint open sub-intervals Ωc, Ωe, Ωa ⊂ R, see Figure 2.1. For tend > 0 we define Q ∶= Ω × (0,tend) and let c, φ ∶ Q → R and α, β, λ, κ ∶ R² → R; the notation can be found in Table 1.2 in the appendix.

Fig. 2.1. Structure of the considered battery domain: positive electrode Ωc, electrolyte Ωe and negative electrode Ωa, with the Butler-Volmer equation at the interfaces.

The transport processes in a battery, i.e. transport of mass and charge, are described by the equations [17]

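$$\frac{\partial c}{\partial t} - \nabla \cdot \left( \alpha(c, \varphi) \nabla c + \beta(c, \varphi) \nabla \varphi \right) = 0 \tag{2.1a}$$

$$-\nabla \cdot \left( \lambda(c, \varphi) \nabla c + \kappa(c, \varphi) \nabla \varphi \right) = 0 \tag{2.1b}$$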

in Ωc × (0,tend), Ωe × (0,tend) and Ωa × (0,tend), where “∇” denotes the gradient and “∇⋅” the divergence. The positive electrode is Ωc (cathode for discharge), the electrolyte is Ωe and the negative electrode is Ωa (anode for discharge).

Boundary/Interface Conditions 

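The boundary conditions are

$$\frac{\partial c}{\partial \nu} = 0, \quad \varphi = 0 \quad \text{on } (\partial\Omega \cap \partial\Omega_c) \times (0, t_{end}) \tag{2.2a}$$

$$\frac{\partial c}{\partial \nu} = 0, \quad \frac{\partial \varphi}{\partial \nu} = -\frac{I}{\sigma_a} \quad \text{on } (\partial\Omega \cap \partial\Omega_a) \times (0, t_{end}) \tag{2.2b}$$

where ν is the outer unit normal vector, I ∶ R⁺ → R is the time-dependent current and σa ∈ R⁺ \ {0} is the electric conductivity multiplied by the cross section. The initial condition is given by

$$c(x, t_0 = 0) = c_0(x), \quad x \in \Omega \tag{2.3}$$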

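The interface conditions are given by

$$-\left( \alpha(c, \varphi) \nabla c + \beta(c, \varphi) \nabla \varphi \right) = \begin{cases} I(c_{ec}, c_c, \varphi_{ec}, \varphi_c) & \text{in } (\partial\Omega_c \cap \partial\Omega_e) \times (0, t_{end}) \\ -I(c_{ea}, c_a, \varphi_{ea}, \varphi_a) & \text{in } (\partial\Omega_e \cap \partial\Omega_a) \times (0, t_{end}) \end{cases} \tag{2.4a}$$

$$-\left( \lambda(c, \varphi) \nabla c + \kappa(c, \varphi) \nabla \varphi \right) = \begin{cases} J(c_{ec}, c_c, \varphi_{ec}, \varphi_c) & \text{in } (\partial\Omega_c \cap \partial\Omega_e) \times (0, t_{end}) \\ -J(c_{ea}, c_a, \varphi_{ea}, \varphi_a) & \text{in } (\partial\Omega_e \cap \partial\Omega_a) \times (0, t_{end}) \end{cases} \tag{2.4b}$$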

where cea is the concentration in the electrolyte at the negative electrode interface:

$$c_{ea}(t) = \lim_{h \to 0} c\left( x|_{\Omega_e \cap \Omega_a} - h, t \right)$$

and h > 0 is small enough, i.e. $x|_{\Omega_e \cap \Omega_a} - h \in \Omega_e$. Analogously

$$c_{ec}(t) = \lim_{h \to 0} c\left( x|_{\Omega_e \cap \Omega_c} + h, t \right)$$

$$c_c(t) = \lim_{h \to 0} c\left( x|_{\Omega_e \cap \Omega_c} - h, t \right)$$

$$c_a(t) = \lim_{h \to 0} c\left( x|_{\Omega_e \cap \Omega_a} + h, t \right)$$

for sufficiently small h > 0. The variables φc, φa, φec, φea are defined in the same way. We write cs for the concentration in the solid part, i.e. in the negative and positive electrode, and ce for the concentration in the electrolyte; φs and φe are denoted analogously. The scalar functions I ∶ R⁴ → R and J ∶ R⁴ → R are defined by

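$$I(c_e, c_s, \varphi_e, \varphi_s) = \frac{J(c_e, c_s, \varphi_e, \varphi_s)}{F}$$

$$J(c_e, c_s, \varphi_e, \varphi_s) = k \sqrt{\frac{c_e}{c_e^0}} \sqrt{\frac{c_s}{c_s^0}} \sqrt{1 - \frac{c_s}{c_{s,max}}} \cdot 2 \sinh\left( \frac{F}{2RT} \left( \varphi_s - \varphi_e - U_0(c_s) \right) \right)$$

where F = 96486 A⋅s/mol is the Faraday constant, R = 8.314 A⋅V⋅s/(K⋅mol) is the gas constant and T > 0 [K] is the temperature. The function U0 ∶ R → R is the overpotential and depends on the concentration c in the electrodes. The coefficient functions are defined as

$$\alpha(c, \varphi) := D_e(c, \varphi) + \frac{RT}{F^2} \frac{\left( t_+(c, \varphi) \right)^2 \kappa(c, \varphi)}{c} \quad \left[ \frac{cm^2}{s} \right]$$

$$\beta(c, \varphi) := \frac{\kappa(c, \varphi) \, t_+(c, \varphi)}{F} \quad \left[ \frac{mol}{V \cdot cm \cdot s} \right]$$

$$\lambda(c, \varphi) := \frac{RT}{F} \frac{t_+(c, \varphi) \, \kappa(c, \varphi)}{c} \quad \left[ \frac{A \cdot cm^2}{mol} \right]$$

where the transference number t+ is zero in the electrodes and larger than zero in the electrolyte and κ is the ionic/electric conductivity; κ, t+, De ∶ R² → R. To measure the battery parameters experimentally it is assumed that they are polynomials in c and φ.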
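The exchange functions I and J used in the interface conditions can be evaluated with a few lines of Python. The sketch below is illustrative only: the function names are hypothetical, the relation I = J/F is assumed, and the rate constant k, the reference concentrations, the maximal concentration and the function `u0` must be supplied by the user, since no values are given in this section.

```python
import math

F = 96486.0   # Faraday constant [A s / mol]
R = 8.314     # gas constant [A V s / (K mol)]


def exchange_current(ce, cs, phi_e, phi_s, u0, *, k, ce0, cs0, cs_max, temp):
    """Interface function J(ce, cs, phi_e, phi_s) of Butler-Volmer type.

    u0 is a callable returning U0(cs); k, ce0, cs0, cs_max and temp are
    model parameters chosen by the caller (hypothetical in this sketch).
    """
    prefactor = (k * math.sqrt(ce / ce0) * math.sqrt(cs / cs0)
                 * math.sqrt(1.0 - cs / cs_max))
    eta = phi_s - phi_e - u0(cs)
    return prefactor * 2.0 * math.sinh(F / (2.0 * R * temp) * eta)


def exchange_flux(*args, **kwargs):
    """Interface function I = J / F (assumed relation between I and J)."""
    return exchange_current(*args, **kwargs) / F
```

The exchange vanishes when φs − φe equals U0(cs) and changes sign with the potential difference, which is the behaviour expected of a Butler-Volmer law.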

The homogeneous Neumann boundary conditions for the concentration (2.2) mean that no flux of lithium(-ions) can pass through. The inhomogeneous Neumann boundary condition for the potential is Ohm’s law, while the homogeneous Dirichlet boundary condition has no physical meaning: it ensures the uniqueness of the solution if one exists.

The interface conditions describe the exchange of the lithium-ions at the interfaces, which is modeled by the Butler-Volmer equation [1].

For physical reasons we assume that

\[
c(x, t) \geq 0 \quad \forall x \in \Omega,\ t \in (0, t_{end}).
\]

We remark that the coefficient functions $\alpha$ and $\kappa$ are larger than zero for physical reasons: the diffusivity $D_e$ and the conductivity $\kappa$ are larger than zero. Because of the definition of the transference number $t_+$, the coefficient functions $\beta$ and $\lambda$ are greater than or equal to zero.

Discretization of the Problem 

We discretize the partial differential equation system (2.1a)-(2.1b) with the appropriate boundary conditions (2.2) and interface conditions (2.4) by the cell-centered finite volume method. To this end we divide $\Omega_c$ into $N_c \in \mathbb{N}$, $\Omega_e$ into $N_e \in \mathbb{N}$ and $\Omega_a$ into $N_a \in \mathbb{N}$ equidistant control volumes of width $\Delta x$, with $N_D = N_c + N_e + N_a$. We use the method of lines and solve the equation system in every time step. The time step size is $\Delta t$. The spatial integrals are approximated by the midpoint rule and the time integrals by the backward Euler method; for details cf. [17]. The discretized equations are implemented in MATLAB 7.10.0 (R2010a).
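To illustrate the combination of cell-centered finite volumes and the backward Euler method, the following minimal Python sketch treats a strongly simplified stand-in problem: linear diffusion $c_t = (D c_x)_x$ with a constant diffusivity and no-flux boundaries on an equidistant grid. The function name, the constant coefficient and the dense linear solve are assumptions made for illustration; the sketch does not reproduce the nonlinear coupled battery system or the authors' MATLAB implementation.

```python
import numpy as np

def fv_backward_euler_diffusion(c0, D, dx, dt, n_steps):
    """Cell-centered finite volumes + backward Euler for c_t = (D c_x)_x
    with homogeneous Neumann (no-flux) boundaries on an equidistant grid.
    Hypothetical helper for illustration only (constant D, linear problem)."""
    c = np.asarray(c0, dtype=float).copy()
    n = c.size
    r = D * dt / dx**2
    # Assemble A = I - dt * L. Every interior face couples its two
    # neighbouring control volumes; boundary faces carry no flux.
    A = np.eye(n)
    for i in range(n):
        if i > 0:
            A[i, i - 1] -= r
            A[i, i] += r
        if i < n - 1:
            A[i, i + 1] -= r
            A[i, i] += r
    # Backward Euler: solve A c^{n+1} = c^n in every time step.
    for _ in range(n_steps):
        c = np.linalg.solve(A, c)
    return c
```

Because the discrete fluxes cancel pairwise across the faces, the sketch conserves the total amount of the diffusing species, which mirrors the no-flux interpretation of the homogeneous Neumann condition.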

III. REDUCED BASIS METHOD 

Initial Point 

We consider a parametrized PDE which we want to solve for many parameter sets, e.g. for parametric studies. The better the numerical model approximates the physical phenomenon, the more expensive the computation gets. In some cases, e.g. in a parameter analysis, the total effort becomes too large because a single computation is too expensive. Therefore one has to develop a reduced model to obtain cheap solutions.

The reduced basis method is based on the discretized model: the idea is to compute an expensive solution only a "few" times, for different parameter sets in the range of interest. From these so-called "true" solutions basis vectors are computed. The approach is that the reduced solutions for the parameter sets of interest are linear combinations of these basis vectors.

An assumption is that the error between the "exact" analytical solution and the "true" numerical solution is small compared to the error between the "true" and the "reduced" solution.

A big advantage of the present method is that the error between the true and the reduced solution is determined while the reduced model is being "developed" (→ Greedy algorithm). A further property is that the method has two phases: the offline phase, in which the needed true solutions are computed and the reduced model is set up such that it fulfills the given error tolerance, and the online phase, in which the reduced solution(s) are computed. The offline phase is expensive and the online phase is cheap.

Approach 

In the following we have to resolve how to choose the true solutions and how to estimate the error between the reduced and the true solution. The (POD-)Greedy algorithm addresses both issues.

We now describe how to apply the reduced basis method to our battery model. The transport equations of the battery (2.1a)-(2.1b) depend on many parameters: on geometrical parameters (e.g. the width of the electrode), on state parameters (e.g. the temperature) and on battery parameters (e.g. the diffusion coefficient). We denote these parameters by $\mu \in D$ and assume that all of them are polynomials in $c$ and $\varphi$; some are of course constant and hence polynomials of degree zero.

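We write $u^{\mathcal{N}} \in X^{\mathcal{N}}$ for a piecewise linear function; its coefficient vector is denoted by $\mathbf{u}^{\mathcal{N}} \in \mathbb{R}^{\mathcal{N}}$. The discretized problem is now the following: find $u^{\mathcal{N}} \in X^{\mathcal{N}}$ such that

\[
F^{\mathcal{N}}(u^{\mathcal{N}}; \mu) = 0
\quad\Leftrightarrow\quad
\left\langle F^{\mathcal{N}}(u^{\mathcal{N}}; \mu),\, v^{\mathcal{N}} \right\rangle_W = 0 \quad \forall v^{\mathcal{N}} \in X^{\mathcal{N}},
\tag{3.1}
\]

where the mass matrix $W$ is in the present case given by $W = \Delta x \cdot \mathbb{1} \in \mathbb{R}^{\mathcal{N} \times \mathcal{N}}$.

We approximate the finite volume space by an $N$-dimensional space $X^N$ which is spanned by "snapshots". A snapshot is a true solution for a specific parameter set $\mu \in D$ and time node $t \in (0, t_{end})$. We now assume that we have an (orthonormalized) basis $\Xi = (\xi_1, \ldots, \xi_N) \in \mathbb{R}^{\mathcal{N} \times N}$ of this space and that the reduced solution can be written as

\[
u^N(\mu) = \sum_{j=1}^{N} \theta_j^{u_N}(\mu)\, \xi_j .
\tag{3.2}
\]

If we replace $u^{\mathcal{N}}$ in equation (3.1) by $u^N$ of equation (3.2) and choose $v^{\mathcal{N}} = \xi_i \in \mathbb{R}^{\mathcal{N}}$, we get

\[
\Xi^T \cdot W \cdot F^{\mathcal{N}}\left(\Xi\, \theta^{u_N}(\mu); \mu\right) =: F^N\left(\theta^{u_N}(\mu)\right) = 0 .
\tag{3.3}
\]

The start vector for the coefficient vector in Newton's method can be obtained by

\[
u^{start}_{coeff}(\mu) = \Xi^T \cdot W \cdot u^{\mathcal{N}}(\cdot, t_0 = 0; \mu).
\]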
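The projection steps behind (3.2), (3.3) and the start vector can be sketched in a few lines of Python. The helper names, the generic residual callback `F(u, mu)` and the Gram-Schmidt routine with respect to the $W$-inner product are assumptions made for illustration; they are not taken from the authors' implementation.

```python
import numpy as np

def w_orthonormalize(snapshots, W):
    """Gram-Schmidt with respect to the inner product <a, b>_W = a^T W b.
    Columns of `snapshots` are snapshot vectors (hypothetical helper)."""
    basis = []
    for s in snapshots.T:
        v = s.astype(float).copy()
        for q in basis:
            v -= (q @ W @ v) * q          # remove the component along q
        norm = np.sqrt(v @ W @ v)
        if norm > 1e-12:                  # skip (numerically) dependent vectors
            basis.append(v / norm)
    return np.column_stack(basis)

def start_coefficients(Xi, W, u0):
    """Coefficients of the W-orthogonal projection of u0 onto span(Xi)."""
    return Xi.T @ W @ u0

def reduced_residual(theta, Xi, W, F, mu):
    """Reduced residual Xi^T W F(Xi theta; mu), cf. (3.3).
    F is any user-supplied full-order residual function (assumption)."""
    return Xi.T @ W @ F(Xi @ theta, mu)
```

With a $W$-orthonormal basis, $\Xi^T W \Xi$ is the identity, so the start coefficients reproduce any initial state that already lies in the span of the basis. Note that the full-order residual is still evaluated inside `reduced_residual`, so each call scales with the full dimension.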

To find a basis for our time-dependent problem we use the POD-Greedy algorithm, cf. Algorithm 1 and [10]. It consists of two loops: the outer loop is the "standard" Greedy, cf. for instance [16], which finds the new parameter set for which the error between the reduced and the true solution is largest. For this we need a problem-specific error estimator; an error estimator for a linear problem discretized by finite volumes can be found in [10]. The inner loop reduces the trajectory $u^{\mathcal{N}}(\cdot, t_n; \mu^*)$ in time with the POD (proper orthogonal decomposition) algorithm. For each snapshot matrix $u^{\mathcal{N}}(\mu)$ this algorithm returns $\ell \in \mathbb{N}$ eigen-/basis vectors. One can prescribe either the number of basis vectors or the projection error of the POD method. For details on the POD algorithm cf. for instance [11], [18].

Usually an error estimator is used to estimate the error between the true and the reduced solution for each parameter of the discretized parameter set. So far we have no error estimator for the present problem, so we compare the true solution for each parameter set of the training set with the reduced solution computed from the reduced basis vectors obtained so far. The function $u_{RB}^{(i)}(\mu_j)$ is the reduced solution constructed from $i$ basis functions, evaluated at the parameter set $\mu_j \in D_{train} \subset D$.

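Algorithm 1 POD-Greedy algorithm, cf. [10]

Require:
● Limit the parameter range and discretize the parameter set: $D_{train} = \{\mu_1, \ldots, \mu_{N_P}\}$
● $\Xi_{train} = \{u^{\mathcal{N}}(\mu_1), \ldots, u^{\mathcal{N}}(\mu_{N_P})\}$, $u^{\mathcal{N}}(\mu_i) \in \mathbb{R}^{N_x \times N_t}$ for all $i \in \{1, \ldots, N_P\}$
● Choose a tolerance for the Greedy: $TOL_{Greedy}$
● Choose the exactness $per \in [0, 1]$ of the POD basis; it is just a measure of the projection error (or directly choose the number $\ell$ of POD basis elements in each "Greedy step").

Ensure:
1: Initializing:
   ● Choose $\mu^{(1)} \in D_{train}$ → $\tilde{\xi}^{(1)} \in \Xi_{train}$; $\xi^{(1)} = POD(\tilde{\xi}^{(1)}) \in \mathbb{R}^{N_x \times \ell^{(1)}}$, with the POD tolerance $per$
   ● Set: $i = 2$, $\varepsilon = 1$
2: while $i \leq N_P$ and $\varepsilon > TOL_{Greedy}$ do
3:   $[\mu^{(i)}, \varepsilon] = \max_{j \in \{1, \ldots, N_P\}} \left| u_{RB}^{(i-1)}(\mu_j) - u^{\mathcal{N}}(\mu_j) \right|$
4:   $\tilde{\xi}^{(i)} = POD\left( \{u^{\mathcal{N}}(\mu^{(i)})\}_{n=0}^{N_T} \right) \in \mathbb{R}^{N_x \times \ell^{(i)}}$, where $u^{\mathcal{N}}(\mu^{(i)}) \in \Xi_{train}$
5:   $(\xi^{(1)}, \ldots, \xi^{(i)}) = \text{Gram-Schmidt}(\xi^{(1)}, \ldots, \xi^{(i-1)}, \tilde{\xi}^{(i)})$
6:   $i = i + 1$
7: end while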
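The loop structure of Algorithm 1 can be sketched in Python as follows. Several simplifications are assumptions of this sketch and not part of the paper: the reduced solution is replaced by the orthogonal projection of the true trajectory onto the current basis, the Euclidean inner product is used, and the POD is applied to the projection error of the selected trajectory (the variant described in [10]), so that the new modes are orthogonal to the existing basis without a separate Gram-Schmidt step. As in the text, the true error takes the place of an error estimator.

```python
import numpy as np

def pod(snapshots, energy=0.99):
    """Leading left singular vectors that capture the fraction `energy`
    of the sum of squared singular values of the snapshot matrix."""
    U, s, _ = np.linalg.svd(snapshots, full_matrices=False)
    total = np.sum(s**2)
    if total == 0.0:
        return U[:, :0]
    ratio = np.cumsum(s**2) / total
    ell = min(int(np.searchsorted(ratio, energy)) + 1, s.size)
    return U[:, :ell]

def pod_greedy(trajectories, tol, energy=0.99):
    """trajectories: list of (Nx, Nt) arrays, one per training parameter.
    Returns the basis and the indices of the selected parameters."""
    basis = pod(trajectories[0], energy)          # initialization with mu^(1)
    chosen = [0]
    while len(chosen) < len(trajectories):
        # True projection error for every training parameter (no estimator).
        errors = [np.linalg.norm(U - basis @ (basis.T @ U)) for U in trajectories]
        worst = int(np.argmax(errors))
        if errors[worst] <= tol:
            break
        # Compress the part of the worst trajectory not yet represented.
        residual = trajectories[worst] - basis @ (basis.T @ trajectories[worst])
        basis = np.hstack([basis, pod(residual, energy)])
        chosen.append(worst)
    return basis, chosen
```

In this form every Greedy step needs all true trajectories of the training set, which is exactly the cost that an a posteriori error estimator would avoid.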

An essential property of the reduced basis method is that the computation can be decomposed into an offline and an online phase.

Offline: After fixing the set of parameter sets and the accuracy of the reduced solutions, we start the Greedy algorithm to compute the basis vectors and the true solutions for the chosen parameters, respectively. The offline phase is computationally expensive, so the reduced basis method is only worthwhile if the equation system has to be solved many times.

Online: In the present case we have to compute the coefficient vector with respect to the basis for the different parameter sets. We obtain it by the damped Newton's method. This phase is cheap.
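A damped Newton iteration of the kind used in the online phase can be sketched as follows. The simple step-halving rule, the default tolerance and the function names are assumptions for illustration; the paper does not specify which damping strategy is used.

```python
import numpy as np

def damped_newton(F, J, x0, tol=1e-5, max_iter=50):
    """Newton's method with step halving (hypothetical damping rule):
    the step length is reduced until the residual norm decreases."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        r = F(x)
        norm_r = np.linalg.norm(r)
        if norm_r <= tol:
            return x
        dx = np.linalg.solve(J(x), -r)     # full Newton direction
        lam = 1.0
        while lam > 1e-8 and np.linalg.norm(F(x + lam * dx)) >= norm_r:
            lam *= 0.5                     # damp the step
        x = x + lam * dx
    if np.linalg.norm(F(x)) <= tol:
        return x
    raise RuntimeError("damped Newton did not converge")
```

For the reduced problem, `F` would be the reduced residual in the coefficient vector and `J` its Jacobian, so the linear systems have only the dimension of the reduced basis.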

The big advantage of the reduced solution is that one knows how well it approximates the true solution. A big disadvantage is that if the parameter ranges change, the expensive offline computation usually has to be repeated.

RBM applied on the battery model 

In the following we explain how to apply the reduced basis method to our discretized problem (3.1).

We denote by $F_1^{\mathcal{N}}(C, \Phi)\colon \mathbb{R}^{\mathcal{N}} \times \mathbb{R}^{\mathcal{N}} \to \mathbb{R}^{\mathcal{N}}$ equation (2.1a) with boundary and interface conditions, discretized by the finite volume method; analogously, $F_2^{\mathcal{N}}(C, \Phi)\colon \mathbb{R}^{\mathcal{N}} \times \mathbb{R}^{\mathcal{N}} \to \mathbb{R}^{\mathcal{N}}$ stands for the finite volume discretization of equation (2.1b) with the boundary and interface conditions.

We choose a parameter training set for $C$ and $\Phi$ ¹:

\[
\Xi^C_{train} = \left(c^{\mathcal{N}}(\mu_1), \ldots, c^{\mathcal{N}}(\mu_{N_P})\right), \qquad
\Xi^\Phi_{train} = \left(\varphi^{\mathcal{N}}(\mu_1), \ldots, \varphi^{\mathcal{N}}(\mu_{N_P})\right)
\]

Let us assume that we have a basis matrix $\Psi^c = (\psi^c_1, \ldots, \psi^c_{N_{B_c}}) \in \mathbb{R}^{N_D \times N_{B_c}}$ for $C$ and a basis matrix $\Psi^\varphi = (\psi^\varphi_1, \ldots, \psi^\varphi_{N_{B_\varphi}}) \in \mathbb{R}^{N_D \times N_{B_\varphi}}$ for $\Phi$. Then the reduced models, or reduced functions respectively, are given by

\[
(\Psi^c)^T \cdot W \cdot F_1^{\mathcal{N}}\left(\Psi^c C_{coeff},\, \Psi^\varphi \Phi_{coeff}\right) \overset{!}{=} 0
\tag{3.4a}
\]

\[
(\Psi^\varphi)^T \cdot W \cdot F_2^{\mathcal{N}}\left(\Psi^c C_{coeff},\, \Psi^\varphi \Phi_{coeff}\right) \overset{!}{=} 0
\tag{3.4b}
\]

We should add that the snapshots for $\Psi^c$ and $\Psi^\varphi$ are taken at the same parameter sets (outer POD-Greedy loop), but the number of POD basis elements may differ in order to achieve the same accuracy (inner POD-Greedy loop).

A further issue is how to choose the next parameter in the (POD-)Greedy algorithm. The error has to be evaluated for two functions, $c$ and $\varphi$. So we usually get two different parameter values $\mu_c$ and $\mu_\varphi$ at which the $L^2$-errors of the concentration and of the potential, respectively, attain their maximum values. We then choose the parameter $\mu \in \{\mu_c, \mu_\varphi\}$ corresponding to the greater of the two $L^2$-errors.
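The selection rule for the next Greedy parameter can be written down directly. The sketch below assumes that the $L^2$-errors of the concentration and of the potential have already been computed for every parameter set of the training set; the function name and the tie-breaking rule (concentration first) are assumptions.

```python
import numpy as np

def select_next_parameter(err_c, err_phi):
    """Return the index of the training parameter with the largest L2-error,
    taken over both the concentration and the potential errors."""
    j_c = int(np.argmax(err_c))        # worst parameter for the concentration
    j_phi = int(np.argmax(err_phi))    # worst parameter for the potential
    return j_c if err_c[j_c] >= err_phi[j_phi] else j_phi
```

Since the two errors carry different units, the comparison implicitly weights the potential more strongly whenever its error is numerically larger, which matches the observation in the numerical tests that the potential is the limiting quantity.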

IV. NUMERICAL EXPERIMENTS 

In this section we apply the reduced basis method to the discretized equations (3.1) describing the transport processes in a lithium-ion battery. The spatial step size is $\Delta x = 1\,\mu\mathrm{m}$, the time step size is $\Delta t = 5\,\mathrm{s}$, and we compute $N_t = 10$ time steps. The Newton tolerance for computing the true solution is set to $10^{-6}$ (relative) and $10^{-9}$ (absolute). The discretization error of the finite volume solution is $\varepsilon_{FVM} = O\left(\frac{1}{100}\right)$.

¹ One can also choose different training sets, cf. e.g. [7].


TABLE 4.1
BATTERY PARAMETERS, [17]

Parameter          pos. electrode    electrolyte    neg. electrode
De [cm²/s]         1.0·10⁻⁹          7.5·10⁻⁷       3.9·10⁻¹⁰
κ [A/(V·cm)]       0.038             0.002          1.0
c0 [mol/cm³]       0.020574          0.001          0.002639
cmax [mol/cm³]     0.02286           -              0.02639
U0 [V]             0.001             -              0
t+ [−]             0                 0.2            0
k [A/cm²]          1.3716·10⁻⁴       -              5.2780·10⁻⁷
N· [−]             10                30             10
A· [cm²]           (50·10⁻⁴)²        -              (50·10⁻⁴)²

The Newton tolerance for the reduced solution is $10^{-5}$. So the $L^2$-error between the finite volume and the reduced solution is at best less than

\[
\varepsilon_{L^2} = 10^{-5} \cdot N_x \cdot N_t = 0.005 =: \varepsilon_{Greedy}.
\]

The tolerance of the POD method is set to 99%; for the POD method cf. for instance [18].

Our "standard" battery parameter set is listed in Table 4.1; the notation can be found in Table 1.2 in the appendix. We charge the battery with $1.5913 \cdot 10^{-8}\,\mathrm{A}$, which corresponds to a 1C rate, and set the temperature to $T = 300\,\mathrm{K}$.

Test 1 

In this subsection we vary the open circuit voltage in the positive electrode: $\mu = U_c \in [0.001, 4.501]$. We discretize this parameter range with the equidistant step width $\Delta U_c = 0.1\,\mathrm{V}$, so the training set for the parameter consists of 46 values. All the other parameters are fixed, cf. Table 4.1.

Figures 4.1 and 4.2 present the finite volume solutions chosen in the first two iterations of Algorithm 1. The associated parameters are $U_c = 3.001\,\mathrm{V}$ and $U_c = 4.501\,\mathrm{V}$. The concentration seems to be less sensitive to the open circuit voltage than the electrical potential. The $L^2$-error for the concentration becomes worse after the second Greedy step, but already after the first Greedy step the error is smaller than the $L^2$-error tolerance $\varepsilon_{L^2}$ and stays smaller; the $L^2$-error for the electrical potential becomes considerably smaller with the information of a second true solution, cf. Figures 4.3 and 4.4. The same observation can be made for the $L^\infty$-error, which is not presented here. The basis functions are shown in Figure 4.5: they have the same "structure" as the finite volume solutions for one fixed time step, and there is just one basis function for each Greedy step.

The speed-up of the reduced solution in comparison to the true solution is 17.54.

Fig. 4.1. Test 1: Finite volume solution for the concentration (left) and the potential (right) for Uc = 3.001 V. (Axes: x [μm], t [s]; c [mol/cm³] and U [V].)

Fig. 4.2. Test 1: Finite volume solution for the concentration (left) and the potential (right) for Uc = 4.501 V. (Axes: x [μm], t [s]; c [mol/cm³] and U [V].)

Fig. 4.3. Test 1: L²-error after the first Greedy step for the concentration (left) and the potential (right). (Axes: U0c [V], t [s].)

Fig. 4.4. Test 1: L²-error after the second Greedy step for the concentration (left) and the potential (right). (Axes: U0c [V], t [s].)

Fig. 4.5. Test 1: Basis functions for the first Greedy step for the concentration (left) and the potential (right). (Legend: Greedy step 1, Greedy step 2; horizontal axis: spatial index.)

Test 2

In this section we vary a few parameters:

\[
\mu = \{D_{ec}, D_{ee}, D_{ea}, t_+, k_c, k_a\}
\]

The subscript c denotes the parameter in the positive electrode, e in the electrolyte and a in the negative electrode. We choose the following parameter

set range: for the diffusion coefficients $D_{ec} \in [1.0 \cdot 10^{-9}, 1.1 \cdot 10^{-9}]$, $D_{ee} \in [7.5 \cdot 10^{-7}, 7.6 \cdot 10^{-7}]$, $D_{ea} \in [3.9 \cdot 10^{-10}, 4.0 \cdot 10^{-10}]$, the transference number $t_+ \in [0.2, 0.3]$ and the reaction rates $k_c, k_a \in [0.02, 0.022]$. We discretize the parameter set in the following way: for the diffusion coefficients we choose the boundary values; for the other parameters we also take the boundary values and a value in between. With this discretization we get a training set of $2^3 \cdot 3^3 = 216$ parameter sets. All the other parameters are fixed as noted in Table 4.1, but in contrast to the previous subsection the POD tolerance is set to $1 - 1 \cdot 10^{-8}$ %.
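The size of the training set follows from a full tensor grid over the six parameters, which can be checked with a short Python sketch. The interval bounds are taken from the text; the intermediate values are assumed to be the interval midpoints, since the text only states "a value in between".

```python
from itertools import product

# Diffusion coefficients: boundary values only (two values each).
D_ec = (1.0e-9, 1.1e-9)
D_ee = (7.5e-7, 7.6e-7)
D_ea = (3.9e-10, 4.0e-10)

# Other parameters: boundary values plus one value in between
# (the midpoint is an assumption made for this illustration).
t_plus = (0.2, 0.25, 0.3)
k_c = (0.02, 0.021, 0.022)
k_a = (0.02, 0.021, 0.022)

# Full tensor grid: 2 * 2 * 2 * 3 * 3 * 3 = 216 parameter sets.
training_set = list(product(D_ec, D_ee, D_ea, t_plus, k_c, k_a))
```

Each element of `training_set` is one tuple $(D_{ec}, D_{ee}, D_{ea}, t_+, k_c, k_a)$ for which a true solution would be computed in the offline phase.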

The graphical results are shown in Figures 4.6 and 4.7. Figure 4.6 shows the finite volume solutions for the first parameter set, which is the one listed in Table 4.1. The $L^2$-error of the reduced solutions in comparison to the finite volume solution is smaller than $10^{-5}$ for the concentration and smaller than $10^{-4}$ for the electrical potential for all 216 parameter sets. The $L^\infty$-error is smaller than $10^{-3}$ for the concentration as well as for the potential after one Greedy step. The basis functions are plotted in Figure 4.7: there are four basis functions for the concentration and three for the potential.

Fig. 4.6. Test 2: Finite volume solution for the concentration (left) and the potential (right) for first parameter set. (Axes: x [μm], t [s]; c [mol/cm³] and U [V].)

Fig. 4.7. Test 2: Basis functions after the first Greedy step for the concentration (left) and the potential (right). (Horizontal axis: spatial index.)

After one Greedy step the $L^2$-error between the reduced solution and the true solution is smaller than the $L^2$-error tolerance $\varepsilon_{L^2}$ for all 216 parameter sets. That means that the information of just one computationally expensive true solution is needed to compute the reduced solutions for all 216 parameter sets with an acceptable error.

The speed-up of the reduced solution in comparison to the true solution is 12.37.

V. DISCUSSION 

In the present document we follow the same approach as in [10].

The reduced basis approach works for the transport equations in a lithium-ion battery: in the above numerical tests at most two Greedy steps are needed to reach the $L^2$-error tolerance $\varepsilon_{L^2}$. If all finite volume solutions of the training set had to be known, the method would not be efficient. In the above numerical tests we see that the limiting factor is the potential: for the concentration one Greedy step is sufficient, but for the potential we need an additional Greedy step in some cases.

TABLE 1.2
NOTATIONS OF THE BATTERY PARAMETERS

c [mol/cm³]        concentration of the lithium/lithium-ions
φ [V]              electrical potential
α [cm²/s]          coefficient function
β [mol/(V·cm·s)]   coefficient function
λ [A·cm²/mol]      coefficient function
I [A]              current
De [cm²/s]         diffusion coefficient
κ [A/(V·cm)]       electric/ionic conductivity
c0 [mol/cm³]       start concentration of lithium in the electrodes/electrolyte
cmax [mol/cm³]     maximum of lithium the electrode can store
U0 [V]             open circuit potential
t+ [−]             transference number
k [A/cm²]          reaction rates
N· [−]             number of control volumes
A· [cm²]           cross section

The application of the reduced basis method to this problem is not completed yet: we have to develop an a posteriori error estimator so that we do not have to compute all true solutions for the discretized parameter set. Furthermore, for the presented numerical tests the computation of the reduced solutions is fast compared to the finite volume solutions, but not as fast as it could be. In every Newton step we have to evaluate the nonlinearities completely. Also, we have no affine parameter dependence. For a linear(ized) problem with affine parameter dependence one can separate the parameter dependence from the bilinear form and from the linear form. To obtain an affine parameter dependent as well as a linearized problem we have to apply the (discrete) empirical interpolation method, cf. for instance [2], [3].

VI. ACKNOWLEDGMENTS 

The authors gratefully acknowledge support by the Adam Opel AG. In addition, the Competence Center The Virtual Vehicle (Graz) supported the lecture by Mr. Volkwein within the scope of the IGTE Symposium.

APPENDIX 

NOTATION 

In Table 1.2 one can find some notations for the battery 

parameters. 


REFERENCES 

[1] P.W. Atkins, “Physikalische Chemie”, Wiley-VCH, 2., vollst. 

neubearb. A., 1996. 

[2] M. Barrault and N.C. Nguyen and Y. Maday and A.T. Patera, 

“An “Empirical Interpolation” Method: Application to Efficient 

Reduced-Basis Discretization of Partial Differential Equations”, C. 

R. Acad. Sci. Paris, Série I., pp. 667–672, 339, 2004. 

[3] S. Chaturantabut and D.C. Sorensen, “Nonlinear Model Reduction 

via Discrete Empirical Interpolation”, SIAM J. Sci. Comput., 32(5), 

pp. 2737–2764, 2010. 

[4] M. Doyle and T.F. Fuller and J. Newman, “Modeling of Galvanostatic 

Charge and Discharge of the Lithium/Polymer/Insertion Cell”, 

Journal of The Electrochemical Society, 140(6), pp. 1526–1533, 

1993. 

[5] C. M Doyle, “Design and Simulation of Lithium Rechargeable 

Batteries”, Ph.D. thesis, 1995. 

[6] W. Dreyer and M. Gaberscek and C. Guhlke and R. Huth and J. Jamnik, “Phase Transition and Hysteresis in a Rechargeable Lithium Battery Revisited”, European J. Appl. Math., 22, pp. 267–290, 2011.

[7] A.-L. Gerner and K. Veroy, “Certified reduced basis method for 

parameterized saddle point problems”, SIAM J. Sci. Comput., 

(accepted Jul 2012). 

[8] M.A. Grepl and A.T. Patera, ”A Posteriori Error Bounds for 

Reduced-Basis Approximations of Parametrized Parabolic Partial 

Differential Equations”, Mathematical Modelling and Numerical 

Analysis, 2005, 39(1), pp. 157-181. 

[9] M.A. Grepl, Y. Maday, N.C. Nguyen, and A.T. Patera, ”Efficient 

Reduced-Basis Treatment of Nonaffine and Nonlinear Partial 

Differential Equations”, Mathematical Modelling and Numerical 

Analysis, 2007, 41(3), pp. 575-605. 

[10] B. Haasdonk and M. Ohlberger, “Reduced Basis Method for 

Finite Volume Approximations of Parametrized Linear Evolution 

Equations”, Math. Model. Numer. Anal., 42(2), 2008, pp. 277-302. 

[11] P. Holmes and J.L. Lumley and G. Berkooz and C. Rowley, 

“Turbulence, Coherent Structures, Dynamical Systems and Symmetry”, 

Cambridge Monographs on Mechanics, 2012, Cambridge 

University Press. 

[12] O. Lass and S. Volkwein, “POD Galerkin schemes for nonlinear 

elliptic-parabolic systems”, submitted, 2011. 

[13] A. Latz and J. Zausch and O. Iliev, “Modeling of Species and 

Charge Transport in Li–Ion Batteries Based on Non-Equilibrium 

Thermodynamics”, Lecture Notes in Computer Science 6046, 329– 

337, 2011. 

[14] A. Latz and J. Zausch, “Thermodynamic Consistent Transport 

Theory of Li-Ion Batteries”, Journal of Power Sources 196, 3296- 

3302, 2011. 

[15] J. S. Newman and K. E. Thomas-Alyea, “Electrochemical Systems”, 

Wiley John + Sons, 3rd ed., 2004. 

[16] A. T Patera and G. Rozza, Reduced Basis approximation and 

A Posteriori Error Estimation for Parametrized Partial Differential 

Equations, MIT, 2007. 

[17] P. Popov, Y. Vutov, S. Margenov and O. Iliev, ”Finite Volume 

Discretization of Equations Describing Nonlinear Diffusion in Li- 

Ion Batteries,” Fraunhofer ITWM report 191, 2010. 

[18] S. Volkwein, “Model Reduction Using Proper Orthogonal Decomposition”, 

lecture notes, Konstanz, 2011. 

[19] J. Wu and J. Xu and H. Zou, “On the Well-posedness of a Mathematical Model for Lithium-Ion Battery Systems”, Methods and Applications of Analysis, 13(3), pp. 275–298, 2006.


Surrogate Parameter Optimization based on Space Mapping for Lithium-Ion Cell Models

Matthias K. Scharrer∗, Bettina Suhr∗, and Daniel Watzenig∗†

∗Kompetenzzentrum – Das Virtuelle Fahrzeug Forschungsgesellschaft mbH (ViF), Inffeldgasse 21/A/I, A-8010 Graz, Austria

†Institute of Electrical Measurement and Measurement Signal Processing, Kopernikusgasse 24/4, A-8010 Graz, Austria

E-mail: matthias.scharrer@v2c2.at

Abstract—Optimizing batteries of electric cars is a complex and time consuming task. In order to reduce the number of 

prototypes, development costs and time, reliable numerical models are required. However, optimizing models that reflect the fundamental electrochemical processes is typically computationally expensive. In this paper we present a surrogate model optimization approach based on space mapping to reduce computation time. This technique is applied to the parameter estimation problem of an electrochemical cell model by linking a coarse linearized model to the accurate model. We present results of two synthetic fitting problems, solved directly and by our surrogate optimization method, to validate the approach. A reduction of computation time of 15% for the one-dimensional case and 25% for the two-dimensional case is obtained. We discuss a simple measure that doubles the achieved reduction to 48% for the latter. The method can easily be adapted to speed up other gradient-based optimization problems. Since the electrochemical model used shows strongly non-linear behaviour, the achieved speed-up indicates even better performance in the case of relaxed conditions.

Index Terms—multi physics, space mapping, surrogate optimization 


I. INTRODUCTION

In terms of pollutant emissions during vehicle operation, battery-powered and hybrid vehicles are clearly more environmentally friendly than those purely based on combustion engines. In order to reduce the number of expensive prototypes, fast and reliable simulations of the electrical and chemical behavior of cells are becoming increasingly important. A more efficient operation of a battery can also be achieved when simulation models are used to estimate the battery's internal states, e.g. state-of-charge (SOC), state-of-function and state-of-health, from measurement data.

Many internal variables and material properties are difficult to access or not measurable. Several approaches exist to gain insight into the internal dynamic processes of a lithium-ion cell. The field was pioneered by Newman and co-workers [1], [2]. Overviews can be found in [3] and [4]. The authors focus on modeling the cell in terms of transport equations for lithium ions, chemical interaction and electronic field computations in the active particles of anode and cathode. These are coupled by modeling the electrode kinetics occurring on the particle surfaces of the electrodes.

Recently, much effort has been put into estimating parameters for cell models to gain better knowledge about effects occurring during the lifetime of a cell. In this work we focus on non-invasive methods only, i.e. methods that estimate parameters by matching predicted cell model voltages for a given current profile to experimental measurements without the need to destructively open the cell. For example, Santhanagopalan and White [5] devised an online SOC estimation method by applying an extended Kalman filter to a simplified ordinary differential equation model. In [6], [7] a gradient-based method for parameter estimation based on Levenberg-Marquardt optimization is introduced. Here Santhanagopalan et al. investigate both single- and multi-particle systems and successfully identify five parameters for constant current charge and discharge cycles. Schmidt et al. estimate parameters of a single-particle model in a combined approach by performing a Fisher-information-based parameter analysis and subsequently applying a pattern search algorithm [8]. In contrast to the single-particle models (SPM) and deterministic estimation algorithms above, Forman et al. [9] focus on fitting the Doyle-Fuller-Newman (DFN) model [1] to battery cycling data by applying a genetic algorithm.

The above literature provides an overview of a range of methods and models for estimating the intrinsic material properties and unknown states. In contrast to using an SPM, we try to find a mechanism that allows us to use the more detailed DFN model, as suggested in [9]. But, as opposed to the latter, we try to further improve the speed of parameter estimation, up to online application if possible.

In this paper the application of the so-called space mapping method, first mentioned in [10], to the parameter estimation of the cell model is investigated. Thus, it is possible to apply a fast gradient-based Gauss-Newton optimization method to the complex DFN model and to use a simplified model as a surrogate to speed up both direct model evaluation and gradient computation at once.

The remainder of this paper is structured as follows: Section II defines the simulation framework of the cell model and briefly summarizes the solution procedure. A mechanistic model describing the electrochemistry of a lithium-ion cell, motivated by the DFN model [1], has been implemented as a system of coupled non-linear partial differential equations (PDEs) in one dimension [11]. In Section III the optimization problem and the solution algorithm for estimating the parameters are defined. Section IV presents a framework for replacing many of the time-consuming evaluations of the complex forward model by a surrogate model (in this case a linearization of the complex model) and describes how a link between the two models is established. In Section V we present and discuss the results. Finally, Section VI summarizes and concludes the paper.

II. FORMULATION OF THE PROBLEM 

In order to describe the internal dynamic processes in a lithium-ion cell mathematically, a mechanistic electrochemical model has been realized as a system of coupled non-linear partial differential equations in one dimension [11]. A lithium-ion cell with two porous intercalation electrodes (cathode in $\Omega_c$ and anode in $\Omega_a$) and an electronically insulating separator in $\Omega_s$ in between is considered. For homogenization purposes each electrode is assumed to consist of two phases. We assume spherical particles in both cathode (in $\Lambda_c$) and anode (in $\Lambda_a$), which line up continuously in x direction. The liquid phase modeled in each electrode is the electrolyte. In the separator $\Omega_s$ we only consider the electrolyte, as the solid phase in the separator does not participate in the reactions. Figure 1 gives a schematic view of the modeled domain.

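Fig. 1. Problem domain: The spatial domains are defined as $\Omega = \Omega_a \cup \Omega_s \cup \Omega_c \subset \mathbb{R}$, $\Omega' = \Omega_a \cup \Omega_c$, $\Lambda_a = \Omega_a \times [0, R_a] \subset \mathbb{R}^2$, $\Lambda_c = \Omega_c \times [0, R_c] \subset \mathbb{R}^2$, $\Lambda = \Lambda_a \cup \Lambda_c$ and $R_a, R_c \in \mathbb{R}$.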

The implemented model is similar to the widely 

used DFN approach [1], extended to include additional 

aspects, e.g. from [2] and [12]. The full model will 

be described in [13]. It is a coupled system of non– 

linear partial differential equations in one dimension. 

The variables are potentials and concentrations for the 

electrolyte (ϕℓ,cℓ), for the cathode (ϕsc,csc) and for 

the anode (ϕsa,csa). The one–dimensional cell model 

considered is defined by the system (1). 



TABLE I
LIST OF SYMBOLS

A_i          inner surface (m² m⁻³)
D_ℓ          diffusivity in electrolyte (m² s⁻¹)
D_s          diffusivity in solid (m² s⁻¹)
C_dl         double layer capacity (F m⁻¹)
F            Faraday's constant (= 96485 C mol⁻¹)
R            universal gas constant (= 8.31447 J mol⁻¹ K⁻¹)
T            temperature (K)
U_OCV(c_s)   equilibrium potential function (V)
c_ℓ          Li⁺-concentration in electrolyte (mol m⁻³)
c_ℓ,0        initial Li⁺-concentration in electrolyte (mol m⁻³)
c_s          Li⁺-concentration in active particles (mol m⁻³)
c_s,0        initial Li⁺-concentration in particles (mol m⁻³)
i(t)         cell current density (A m⁻²)
j*_BV        Butler-Volmer current density (A m⁻²)
k            exchange current density and reaction rate (mol m⁻² s⁻¹)
t⁺_ℓ         transference number of cations (1)
z            number of transferred electrons per unit (Li⁺: z = 1) (1)
α_A, α_K     anodic/cathodic charge transfer coefficients (1)
ε_ℓ          electrolyte volume fraction (1)
κ(c_ℓ)       ionic conductivity function (S m⁻¹)
μ_ℓ          migration coefficient
ϕ_s          electrochemical potential of the active material (V)
ϕ_ℓ          electrochemical potential of the electrolyte (V)
σ_s          electronic conductivity (S m⁻¹)

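\[
\begin{aligned}
-\nabla \cdot \left(\sigma_s \nabla \varphi_s\right) &= -A_i\, j^*_{BV} && \text{in } \Omega' \\
-\nabla \cdot \left(\kappa_\ell(c_\ell) \nabla \varphi_\ell + \frac{RT}{zF}\, \kappa_\ell(c_\ell)\, t_\ell^+\, \frac{1}{c_\ell} \nabla c_\ell\right) &= A_i\, j^*_{BV} && \text{in } \Omega \\
\frac{\partial (\varepsilon_\ell c_\ell)}{\partial t} - \nabla \cdot \left(D_\ell \nabla c_\ell + \frac{zF}{RT}\, \mu_\ell c_\ell \nabla \varphi_\ell\right) &= \frac{A_i}{zF}\, j^*_{BV} && \text{in } \Omega \\
\frac{\partial c_s}{\partial t} - \frac{1}{r^2} \frac{\partial}{\partial r}\left(D_s r^2 \frac{\partial c_s}{\partial r}\right) &= 0 && \text{in } \Lambda \\
j^*_{BV} &= \begin{cases}
zFk \exp\left(\dfrac{\alpha_A zF \left(\varphi_s - \varphi_\ell - U_{OCV}(c_s)\right)}{RT}\right) \\
\quad - zFk \exp\left(\dfrac{-(1 - \alpha_K) zF \left(\varphi_s - \varphi_\ell - U_{OCV}(c_s)\right)}{RT}\right) \\
\quad + C_{dl} \dfrac{\partial (\varphi_s - \varphi_\ell)}{\partial t} & \text{in } \Omega' \\
0 & \text{else}
\end{cases}
\end{aligned}
\tag{1}
\]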

where the system variables are defined as ·(t, x) at time 

t ∈ [0,T], T ∈ R and at space point x and (x, r), 

respectively. A comprehensive overview of symbols is 

given in Table I. 

Homogeneous Neumann conditions are applied at the boundaries, except for the outer boundaries of the potentials and concentrations in the solid phase:

\[
\begin{aligned}
\varphi_s &= 0 && \text{on } \Gamma_a \times [0, T] \\
-\sigma_s \nabla \varphi_s &= -i(t) && \text{on } \Gamma_c \\
-D_s \frac{\partial c_s}{\partial r} &= \frac{1}{zF}\, j^*_{BV} && \text{on } \Gamma_{R_o,a} \cup \Gamma_{R_o,c}
\end{aligned}
\tag{2}
\]

The concentrations are restricted by the following initial conditions:

\[
\begin{aligned}
c_\ell &= c_{\ell,0} && \text{in } \Omega \\
c_s &= c_{s,0} && \text{in } \Lambda
\end{aligned}
\tag{3}
\]

The potentials are consistently initialized at rest by the condition $j(x, 0) = 0$. This system of four non-linearly coupled partial differential equations is solved by the finite element method with linear test functions for the spatial discretization and the backward Euler method for time integration. The non-linearity is treated by a damped Newton method; see [11] for details.

III. PARAMETER ESTIMATION 

The system described in the previous section contains many parameters which cannot be measured directly. To formulate the parameter estimation problem in a general way, we collect the parameters of interest in the parameter vector μ ∈ P_ad ⊂ R^m, where P_ad is the admissible parameter set. The basic optimization problem is introduced as

$$
\mu^* = \operatorname*{arg\,min}_{\mu\in P_{ad}} H\left(f(\mu)\right), \tag{4}
$$

where an optimal set of parameters μ* ∈ P_ad is sought, which minimizes a merit function H of a model response f(μ) depending on the parameters μ.

Since we focus on parameter estimation based on cell voltages, we set H to compute the difference with respect to a prescribed function ŷ. We rewrite (4) as

$$
\mu^* = \operatorname*{arg\,min}_{\mu\in P_{ad}} \sum_i w_i \left\| y(t_i;\mu) - \hat{y}(t_i) \right\|_2^2, \tag{5}
$$

where we minimize the difference between measured cell voltages, ŷ, and computed voltages, f(μ) = y(·; μ) = ϕs|Γc − ϕs|Γa, at predefined times t_i. Variations in time step size are taken into account by the weights w_i.

Classical optimization using this objective function yields unacceptable response times: not only does the system defined in Section II have to be solved, but the derivative of the objective function with respect to every parameter in our set of interest μ has to be estimated as well. Since deriving these derivatives analytically might be intractable for non–linear PDE-constrained problems, we resort to numerical gradient estimation by finite differences, which requires at least one additional simulation per parameter. As the execution time of a single simulation on current hardware varies between seconds and hours, depending on the prescribed input profile, direct evaluation of (5) is not acceptable due to its enormous computation times.
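The cost argument can be made concrete with a minimal sketch of a weighted least-squares objective in the spirit of (5) and its forward-difference gradient. The function names, the relative step size and the two-parameter `toy_model` are hypothetical stand-ins chosen for this example; in the setting of the paper, each call to `model` would be a full cell simulation.

```python
def objective(mu, model, t, y_ref, w):
    """Weighted sum of squared voltage differences at the times t."""
    return sum(wi * (model(ti, mu) - yi) ** 2 for ti, yi, wi in zip(t, y_ref, w))

def fd_gradient(mu, model, t, y_ref, w, rel_step=1e-6):
    """Forward differences: one extra objective evaluation per parameter."""
    base = objective(mu, model, t, y_ref, w)
    grad = []
    for j, mu_j in enumerate(mu):
        h = rel_step * max(abs(mu_j), 1.0)   # step scaled to the parameter size
        shifted = list(mu)
        shifted[j] = mu_j + h
        grad.append((objective(shifted, model, t, y_ref, w) - base) / h)
    return grad

# Hypothetical stand-in for the cell model: a voltage that drops linearly in time.
def toy_model(t, mu):
    return mu[0] - mu[1] * t

t = [0.0, 1.0, 2.0]
w = [1.0, 1.0, 1.0]
y_ref = [toy_model(ti, [4.0, 0.5]) for ti in t]
grad = fd_gradient([4.1, 0.4], toy_model, t, y_ref, w)
```

For m parameters the gradient costs m + 1 objective evaluations, which is what makes the direct approach expensive when a single evaluation takes up to hours.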

IV. PROPOSED FRAMEWORK 

To speed up parameter estimation, we introduce space mapping, first mentioned in [10], to the cell model optimization. The underlying idea is best described as follows: We have a very complex and accurate (fine) model that describes a process on the basis of a few parameters. We search for an optimal set of parameters with respect to some cost function by repeatedly evaluating our model and computing model responses for intermediate sets of parameters. Since a single evaluation of the fine model is expensive, we replace its responses by results of a much cheaper and less accurate (coarse) model, also known as surrogate model, which describes the same process using a similar parameter set. On its own, the coarse model gives only a vague idea of where the optimal parameters of the fine model lie. We therefore link the results by evaluating the fine model and establishing a mapping between the parameter spaces of the fine and the coarse model. Since this results in fewer calls of the fine model, the optimization time can be reduced.



In our case this means substituting evaluations of the fine model u = F(μ), where u is the tuple representing the solutions to (1) and F(·) is the solution operator, by evaluations of a coarse model v = C(λ), where λ ∈ L_ad ⊂ R^m is the parameter set of interest of the coarse model, v is the tuple representing the solutions of the coarse model and C(·) is its solution operator.

A mapping function p : P_ad → L_ad establishes a link between the two models such that the response of the coarse model c(p(μ)) is a good approximation of f(μ). Of course, the coarse model response c(·) has to be defined in the same way as the fine model response f(·). Since the mapping function p cannot be evaluated directly, we introduce a new optimization problem:

$$
\lambda = \operatorname*{arg\,min}_{\tilde{\lambda}\in L_{ad}} \left\| c(\tilde{\lambda}) - f(\mu) \right\|_2^2 \tag{6}
$$

A difficulty of this approach is the computation of the Jacobian of the space mapping. Bandler et al. suggested a time-consuming procedure in [10]. This effort can be avoided by applying Broyden’s formula [14], as discussed in [15]. The latter is used in this paper.
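For orientation, the generic rank-one Broyden update can be sketched as follows. It is written here for a matrix B approximating the Jacobian of the mapping p, given a step `d_mu` in the fine parameters and the corresponding change `d_lam` in the mapped coarse parameters. How exactly these differences are formed in the authors' implementation is not shown in this text, so the pairing of the two arguments is an assumption of the sketch.

```python
def broyden_update(B, d_mu, d_lam):
    """Return B + (d_lam - B d_mu) d_mu^T / (d_mu^T d_mu) as a new matrix."""
    B_dmu = [sum(b * d for b, d in zip(row, d_mu)) for row in B]
    denom = sum(d * d for d in d_mu)
    if denom == 0.0:
        raise ValueError("d_mu must be non-zero")
    return [[B[r][c] + (d_lam[r] - B_dmu[r]) * d_mu[c] / denom
             for c in range(len(d_mu))]
            for r in range(len(B))]

# Illustrative numbers only: start from the identity, as B0 = I in Algorithm 1.
B0 = [[1.0, 0.0], [0.0, 1.0]]
d_mu = [0.5, -0.2]
d_lam = [0.4, -0.1]
B1 = broyden_update(B0, d_mu, d_lam)
```

The updated matrix satisfies the secant condition B1 · d_mu = d_lam without any additional model evaluation, which is the reason the update is cheap compared with a finite-difference Jacobian.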

Using the coarse model response c(·), we reformulate the optimization problem in (4) as follows:

$$
\tilde{\mu}^* = \operatorname*{arg\,min}_{\mu\in P_{ad}} H\left(c\left(p(\mu)\right)\right), \tag{7}
$$

where μ̃* is a coarse approximation of μ*. By iteratively updating p(·), the solution μ̃* of the surrogate problem is supposed to converge towards the real solution μ*. The algorithm applied to indirectly solve the optimization problem defined in (5) by means of space mapping is stated in Algorithm 1.

Following the idea of [16], we obtain the simplified coarse model by linearizing the right hand side of the original system by means of a first-order Taylor series expansion:

$$
j(v;\lambda) \approx \hat{j}(\hat{u};\hat{\mu}) + \nabla_u\hat{j}^T(\hat{u};\hat{\mu})\,(v-\hat{u}) + \nabla_\mu\hat{j}^T(\hat{u};\hat{\mu})\,(\lambda-\hat{\mu}), \tag{8}
$$

where û denotes the state of the original system for a reference parameter set μ̂, and ĵ(û; μ̂) denotes the function j*_BV at a working point. Throughout the rest of the paper we write ĵ for ĵ(û; μ̂). The state and the parameters of the linearized model are denoted by v and λ, respectively.
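The effect of such a linearization can be illustrated with a reduced, scalar version of the kinetics. In the sketch below the state is collapsed to a single overpotential `eta` and the parameter to a single rate `k`; this reduction, the symmetric transfer coefficient of 0.5, z = 1 and the temperature are assumptions made for the example and do not reproduce the full state vector used in the paper.

```python
import math

F = 96485.33212   # Faraday constant, C/mol
R = 8.314462618   # gas constant, J/(mol K)
TEMP = 298.15     # K, assumed
A = 0.5 * F / (R * TEMP)   # alpha * z * F / (R T) with alpha = 0.5, z = 1 (assumed)

def j_exact(eta, k):
    """Non-linear kinetics j = z F k (exp(A eta) - exp(-A eta)) with z = 1."""
    return F * k * (math.exp(A * eta) - math.exp(-A * eta))

def linearize(eta_hat, k_hat):
    """Return the first-order Taylor model of j_exact around (eta_hat, k_hat)."""
    j_hat = j_exact(eta_hat, k_hat)
    dj_deta = F * k_hat * A * (math.exp(A * eta_hat) + math.exp(-A * eta_hat))
    dj_dk = F * (math.exp(A * eta_hat) - math.exp(-A * eta_hat))

    def j_lin(eta, k):
        return j_hat + dj_deta * (eta - eta_hat) + dj_dk * (k - k_hat)

    return j_lin

j_lin = linearize(eta_hat=0.01, k_hat=1e-7)
```

Close to the working point the linear model tracks the non–linear one well, while far away it does not, which is consistent with treating the coarse model as a cheap but less accurate surrogate.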

The non–linear factors on the left hand side, i.e. the ionic conductivity κ_ℓ and direct occurrences of the Li⁺ concentration in solution c_ℓ, are fixed to their initial values. The coarse model as used is stated in (11). In addition, the boundary condition for the concentration in the solid phase c_s changes to:

$$
-D_s\frac{\partial c_s}{\partial r} = \frac{1}{zF}\left(\nabla_u\hat{j}^T v + \nabla_\mu\hat{j}^T\lambda + \hat{J}_c\right) \quad \text{on } \Gamma_{R_o,a}\cup\Gamma_{R_o,c} \tag{9}
$$

where the constant parts of the linearization (8) are given as:

$$
\hat{J}_c = \hat{j} - \nabla_u\hat{j}^T\hat{u} - \nabla_\mu\hat{j}^T\hat{\mu} \tag{10}
$$



 

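$$
\begin{aligned}
-\nabla\cdot\left(\sigma_s\nabla\phi_s\right) + A_i\left(\frac{\partial\hat{j}}{\partial\phi_s}\phi_s + \frac{\partial\hat{j}}{\partial\phi_\ell}\phi_\ell + \frac{\partial\hat{j}}{\partial c_\ell}c_\ell + \frac{\partial\hat{j}}{\partial c_s}c_s\right) &= -A_i\left(\nabla_\mu\hat{j}^T\lambda + \hat{J}_c\right) && \text{in } \Omega' \\
-\nabla\cdot\left(\kappa_\ell(\hat{c}_\ell)\nabla\phi_\ell + \frac{RT}{zF}\,\kappa_\ell(\hat{c}_\ell)\,t^+_\ell\,\frac{1}{\hat{c}_\ell}\nabla c_\ell\right) - A_i\left(\frac{\partial\hat{j}}{\partial\phi_s}\phi_s + \frac{\partial\hat{j}}{\partial\phi_\ell}\phi_\ell + \frac{\partial\hat{j}}{\partial c_\ell}c_\ell + \frac{\partial\hat{j}}{\partial c_s}c_s\right) &= A_i\left(\nabla_\mu\hat{j}^T\lambda + \hat{J}_c\right) && \text{in } \Omega \\
\frac{\partial(\varepsilon_\ell c_\ell)}{\partial t} - \nabla\cdot\left(D_\ell\nabla c_\ell + \frac{zF}{RT}\,\mu_\ell\hat{c}_\ell\nabla\phi_\ell\right) - \frac{A_i}{zF}\left(\frac{\partial\hat{j}}{\partial\phi_s}\phi_s + \frac{\partial\hat{j}}{\partial\phi_\ell}\phi_\ell + \frac{\partial\hat{j}}{\partial c_\ell}c_\ell + \frac{\partial\hat{j}}{\partial c_s}c_s\right) &= \frac{A_i}{zF}\left(\nabla_\mu\hat{j}^T\lambda + \hat{J}_c\right) && \text{in } \Omega \\
\frac{\partial c_s}{\partial t} - \frac{1}{r^2}\frac{\partial}{\partial r}\left(D_s r^2\frac{\partial c_s}{\partial r}\right) &= 0 && \text{in } \Lambda
\end{aligned}
\tag{11}
$$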

Algorithm 1 Space mapping surrogate optimization
Input: Initial μ_0 ∈ P_ad; set i = 0, λ_0 = μ_0, and B_0 = I.
1: Evaluate f(μ_0) and H(f(μ_0))
2: repeat
3:   Define mapping function p_i(μ) ← B_i (μ − μ_i) + λ_i
4:   Compute new candidate parameter μ̃*_{i+1} ← arg min_{μ ∈ P_ad} H(c(p_i(μ)))
5:   Evaluate f(μ̃*_{i+1}) and H(f(μ̃*_{i+1}))
6:   if H(f(μ̃*_{i+1}))

implementation of Algorithm 1 reached its final residual of 4.1·10⁻⁷ after 4 hours and 4 iterations, which corresponds to a runtime reduction of 15%. This speed up results from the number of fine model evaluations being reduced from 20 to 5. However, the additional 54 evaluations of the coarse model, each taking typically 189.6 ± 2.4 s, during the two optimization processes undermine this large reduction.

The differences between the curves resulting from initial and final parameters are mainly related to the strong impact of the non–linearity of the equilibrium potential function U_OCV(c_s). Different initial Li⁺ concentrations c_{s,0} lead to a different working window of the curve, so that the shape changes. Additionally, changing the initial amount of Li⁺ changes the amount available to move inside the system. This intrinsic value can be estimated by the volume integral of the initial Li⁺ concentrations in the electrodes, ∫_Ω ε_s c_{s,0} dΩ. The limits of the Li⁺ concentration of each electrode, commonly referred to as full and empty, respectively, determine the cell capacity. By modifying the cell capacity and the effective load, which can differ from the preset 0.2 h⁻¹, the lower cut–off voltage is reached at different times. The model response and the reference data were padded with their final value to match each other's length. Because of the sharp voltage drop near the empty cell, the sensitivity of the objective function is very high. This simplifies finding the optimum and interpreting the results.

The second task was to simulate a short charge pulse of 100 s at a load of 0.5 h⁻¹ to find the exchange current density and reaction rate μ = k = {k_a, k_c} in both anode Ω_a and cathode Ω_c, starting from 50% SOC. As shown in [6] and confirmed by our own investigations, the exchange current density and reaction rate have a high impact on the results. The reference value μ* in this case was set to (10⁻⁷, 10⁻⁷) mol m⁻² s⁻¹, and the optimization was initialized at μ_0 = (10⁻⁸, 10⁻⁶) mol m⁻² s⁻¹; see Figure 3 for the resulting voltage curves. In this case, tighter stopping criteria than before were applied because of the smaller number of points in time of the problem:

• absolute value of the function value 

H (f (μi))

TABLE II
COARSE MODEL SPEED UP COMPARISON

T            Model runtime    Model evaluations   Total runtime   Speed up
1            2.95 ± 0.07 s    549 (7)             1469 s          1.3
2            2.39 ± 0.05 s    474 (7)             1241 s          1.6
5            2.04 ± 0.05 s    474 (7)             1071 s          1.8
10           1.92 ± 0.04 s    486 (7)             1038 s          1.9
20           1.87 ± 0.04 s    549 (8)             1145 s          1.7
non–linear   11.8 ± 0.19 s    165                 1947 s          1.0

Speed up achieved by space mapping surrogate optimization of kinetic rate parameters k for different reassembly periods T compared to the non–linear case (last line). Model evaluations in parentheses (·) are non–linear model evaluations.
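The speed up column can be reproduced from the total runtimes as the ratio of the non–linear total runtime to the space mapping total runtime. The short sketch below does this with the values of Table II; the variable names are chosen for this example only.

```python
NONLINEAR_TOTAL_S = 1947  # total runtime of the direct (non-linear) optimization, s

# Total runtimes in seconds per reassembly period T, taken from Table II.
totals = {1: 1469, 2: 1241, 5: 1071, 10: 1038, 20: 1145}

speed_up = {T: round(NONLINEAR_TOTAL_S / total, 1) for T, total in totals.items()}
best_T = max(totals, key=lambda T: NONLINEAR_TOTAL_S / totals[T])
```

The ratios round to the tabulated factors, and the largest one is obtained for T = 10.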

TABLE III
PROGRESS OF RESIDUALS DURING OPTIMIZATION

H(f(μk))
i   T = 1       T = 2       T = 5       T = 10      T = 20
0   9.566E-02
1   5.159E-01   5.161E-01   5.159E-01   5.126E-01   4.261E-01
2   5.276E-02   5.280E-02   5.287E-02   4.127E-02   4.179E-02
3   2.414E-03   2.426E-03   2.439E-03   3.793E-03   3.802E-03
4   5.399E-06   5.484E-06   5.648E-06   9.178E-06   1.049E-05
5   3.229E-11   3.726E-11   4.589E-11   6.074E-11   3.258E-11
6   7.749E-13   7.840E-13   7.796E-13   9.783E-13   1.035E-12
7   -           -           -           -           9.031E-13

Residuals achieved by space mapping surrogate optimization of kinetic rate parameters k at iterations i for different reassembly periods T.

Additionally, an optimal reassembly period T exists: the speed up factor rises to 1.9 at T = 10 and drops to 1.7 at T = 20 (Table II). This drop is related to the additional iteration necessary to reach the target residual threshold. Yet, the performance of each iteration of the optimization procedure shows a similar progression for the different reassembly periods, as can be seen in Table III, which strengthens the aforementioned assumption of adaptivity of the algorithm.


This paper shows the application of the space mapping approach to speed up the estimation of the parameters of a DFN-motivated battery model. This is done by replacing model evaluations with the response of a fast surrogate model. Afterwards, the obtained parameters are mapped into the original model’s parameter space by an iteratively refined mapping function.

We have validated the algorithm by applying it to two synthetic fitting problems, where parameters of a simulation are recovered starting from a different point in parameter space. The one-dimensional quasi–stationary case results in a reduction of 15%, whereas the two-dimensional case shows a reduction of 25% compared to the computation times of direct optimization. Further simplification of the coarse model increases the latter to 48%.

A further advantage of using a linear coarse model is the possibility to state the adjoint system, which can be used to compute the exact gradient of the cost function with only one additional evaluation, instead of one per parameter when finite differences are used to approximate the gradient. This way of optimization seems promising, for example, for estimating parameters of a real system or for optimizing the battery itself, with much less effort and higher efficiency than direct methods.


The authors gratefully acknowledge 

financial support from Climate- and 

Energy Fund “Klima- und Energiefonds” 

as part of the program New Energy 2020 

“NEUE ENERGIEN 2020” of the Federal 

Province of Styria/Austria for the project in which the 

above presented research results were achieved. 

REFERENCES 

[1] M. Doyle, T. F. Fuller, and J. Newman, “Modeling of galvanostatic charge and discharge of the lithium/polymer/insertion cell,” J of The Electrochemical Society, vol. 140, no. 6, pp. 1526–1533, 1993.

[2] J. Newman and K. E. Thomas-Alyea, Electrochemical Systems, 

3rd ed. John Wiley & Sons, Inc., Hoboken, New Jersey, 2004. 

[3] P. M. Gomadam et al., “Mathematical modeling of lithium-ion 

and nickel battery systems,” J of Power Sources, vol. 110, no. 2, 

pp. 267–284, 2002. 

[4] S. Santhanagopalan et al., “Review of models for predicting the 

cycling performance of lithium ion batteries,” J of Power Sources, 

vol. 156, no. 2, pp. 620–628, 2006. 

[5] S. Santhanagopalan and R. E. White, “Online estimation of the 

state of charge of a lithium ion cell,” J of Power Sources, vol. 

161, no. 2, pp. 1346–1355, 2006. 

[6] S. Santhanagopalan, Q. Guo, and R. E. White, “Parameter estimation 

and model discrimination for a lithium-ion cell,” J of The 

Electrochemical Society, vol. 154, no. 3, pp. A198–A206, 2007. 

[7] S. Santhanagopalan et al., “Parameter estimation and life modeling 

of lithium-ion cells,” J of The Electrochemical Society, vol. 

155, no. 4, pp. A345–A353, 2008. 

[8] A. P. Schmidt et al., “Experiment-driven electrochemical modeling 

and systematic parameterization for a lithium-ion battery cell,” 

J of Power Sources, vol. 195, no. 15, pp. 5071–5080, 2010. 

[9] J. C. Forman et al., “Genetic identification and fisher identifiability 

analysis of the doyle-fuller-newman model from experimental 

cycling of a lifepo4 cell,” J of Power Sources, vol. 210, no. 0, pp. 

263–275, 2012. 

[10] J. Bandler et al., “Space mapping technique for electromagnetic optimization,” IEEE Trans. on Microwave Theory and Techniques, vol. 42, no. 12, pp. 2536–2544, Dec. 1994.

[11] F. Pichler, “Anwendung der Finite-Elemente Methode auf ein 

Litium-Ionen Batterie Modell,” Master Thesis, University of Graz, 

2011. 

[12] I. J. Ong and J. Newman, “Double-layer capacitance in a dual 

lithium ion insertion cell,” J of The Electrochemical Society, vol. 

146, no. 12, pp. 4360–4365, 1999. 

[13] M. Cifrain et al., “Elektrochemisches Zellmodell,” publication 

planned. 

[14] C. G. Broyden, “A class of methods for solving nonlinear simultaneous 

equations,” Mathematics of Computation, vol. 19, no. 92, 

pp. 577–593, Oct. 1965. 

[15] M. H. Bakr et al., “An introduction to the space mapping 

technique,” Optimization and Engineering, vol. 2, no. 4, pp. 369– 

384, 2001. 

[16] O. Lass et al., “Space mapping techniques for a structural optimization 

problem governed by the p-Laplace equation,” Optimization 

Methods and Software, vol. 26, no. 4-5, pp. 617–642, 

2011. 

[17] E. Jones et al., “SciPy: Open source scientific tools for Python,” 

2001–. [Online]. Available: http://www.scipy.org/ 

[18] MATLAB, version 7.12 (R2011a). Natick, Massachusetts: The 

MathWorks Inc., 2011.


Large Scale Energy Storage with Redox Flow Batteries 

Piergiorgio Alotto, Massimo Guarnieri, Federico Moro and Andrea Stella 

Dipartimento di Ingegneria Industriale, Università di Padova, Via Gradenigo 6/a, 35131 Padova, Italy 

E-mail: name.surname@unipd.it 

Abstract— The expected expansion of renewable energy sources is calling for large and efficient energy storage systems. 

Electrochemical ones are considered the solution of choice in most cases, since they present unique features of localization 

flexibility, efficiency and scalability. Among them, Redox Flow Batteries (RFBs) exhibit remarkably high potential for several 

reasons, including power/energy independent sizing, high efficiency, room temperature operation and extremely long life. 

The most developed RFBs are the all-vanadium based ones (VRB), but other research programs are underway in many 

countries. They aim at substantial improvements which can lead to more compact systems, capable of taking the technology 

to a real breakthrough in stationary grid-connected applications, but which can also prove suitable for powering electric 

vehicles. This paper gives an overview of the RFB technology state-of-the-art, highlights its pros and cons, and indicates 

current research challenges. 

Index Terms— Electrochemical storage, redox flow batteries, vanadium flow batteries. 


Presently, renewable sources except hydroelectric, 

particularly solar and wind, provide roughly 4% of the 

worldwide electricity production, but they are expected 

to grow substantially in the near future (up to 26% by 

2030 [1]). 

In contrast with conventional electrical power plants, 

wind, solar, and other primary renewable sources are 

intermittent, because the generated electrical power 

depends on the time of the day and on the climatic 

availability of resources. The integration into the grid of 

such primary energy sources, each with different 

peculiarities, requires their careful design and control. 

Furthermore, traditional grids have not been designed for 

such operational conditions so that they are not always 

able to work satisfactorily when many renewable-source 

generators are connected. In fact, recent studies suggest 

that grids can become unstable if such sources provide 

more than 20% of the whole generated power without 

adequate energy storage. 

Thus, future power grids with significant shares of renewable sources call for adequate energy storage systems capable of storing production surplus during some periods and of helping to meet higher demand during others, while at the same time contributing to the stabilization of grid operation. Applied in this way, energy storage systems will allow the primary power plants to be substantially undersized with respect to peak demand, since a significant quantity of power will be provided by the storage systems.

Two main application scenarios can be identified: i) “peak shaving” and “sag compensation” refer to charge/discharge cycles on short timescales (seconds to minutes) and are needed for grid stabilization; ii) “load leveling” concerns charge/discharge cycles on longer timescales (hours) and allows the load factor of the grid to be improved.

Several recent surveys show that electrochemical 

storage systems will be the solution of choice for 

complementing intermittent photovoltaic and wind 

generation facilities with long-time-scale energy storage. 

In fact, such storage technologies feature site versatility, modularity, scalability, ease of operation, and no moving parts [2]. Important funding programs have been established worldwide for the scientific and technological development of innovative electrochemical storage systems.

Fig. 1: Discharge times vs. power for mainstream energy storage technologies

Among them, Redox Flow Batteries (RFBs) have emerged in recent years as a very promising solution for stationary applications, in combination with renewable sources, for applications such as peak shaving, sag compensation, and load leveling [3,4,5]. Their potential success stems from the fact that, compared with competing technologies, they cover a very wide range of discharge times (energy) and powers, as shown in Fig. 1.

RFBs exploit reduction and oxidation (redox) reactions of metal ions (i.e. electrochemical species) dissolved in aqueous or nonaqueous liquids. These solutions are stored in external tanks, potentially of very high capacity, and are circulated through the RFB when power exchange is required. The main appealing features of RFBs are: scalability and flexibility, independent sizing of power and energy, high round-trip efficiency, high depth of discharge (DOD), long durability, fast response times, reduced environmental impact, and absence of expensive noble-metal-based catalysts.

In the rest of this paper, the most important features of RFBs will be presented together with an overview of the current state of the art of commercial systems. Furthermore, current research and development issues of RFB systems will be highlighted.

II. RFB BASICS AND FEATURES 

A. RFB basics 

RFB cells operate on the basis of electrochemical reduction and oxidation reactions of two liquid electrolytes containing ionized metals [6]. One electrode performs the oxidation half-reaction of one electrolyte, which releases one electron and one ion, while the other electrode performs the reduction half-reaction, which recombines them into the other electrolyte (Fig. 2). Ions can then migrate from one electrode to the other (from anode to cathode) through a membrane which cannot be crossed by electrons; these are instead forced to pass through the external circuit, thus exchanging electric energy. Typical RFB cells must operate at room temperature in order to keep the solutions in the liquid phase. This condition implies that the ion-conducting membrane between the two electrodes is a polymeric one. Both half-cells are connected to external storage tanks where the solutions are stored, and the solutions are circulated by means of two suitable pumps. Designing a storage system based on a RFB requires expertise in electrochemistry, chemistry, chemical engineering, electrical engineering, power electronics, and control engineering, thus calling for highly multidisciplinary research and development teams.

B. RFB features 

RFBs can be considered a particular type of Fuel Cell (FC), since they can generate electrical power as long as they are fed with fuel (in this case the electrolyte solutions), and indeed the cell structure is very similar to that of Polymer Electrolyte Membrane Fuel Cells (PEMFCs). Another similarity between RFBs and FCs is that energy is stored in components, the tanks, which are physically separated from the cells themselves, where power conversion takes place.

Independent sizing of power and energy in typical RFB systems is therefore possible, and this feature allows for virtually unlimited capacity simply by using larger storage tanks, without altering the battery itself or the power conversion devices. Compared to other electrochemical systems that incorporate energy and power in a single device, RFBs are usually more advantageous when generation for 4-6 hours or more at maximum power is required. Furthermore, RFBs can also be fully charged or discharged and left in such conditions for long periods with no negative effects.

Fig. 2: Schematic of a RFB energy storage system: RFB stack and electrolyte tanks are separated

Fig. 3: Schematic of a typical RFB cell structure with MEA (membrane-electrode assembly) and flow-by solution distribution in bipolar plates with parallel-channel layout (gaskets are not shown)

RFB cells consist of a sandwiched structure composed 

of electrodes and interposed proton conducting 

membranes that are very similar to the Membrane 

Electrode Assembly (MEA) typical of PEMFCs (Fig. 3). 

The electrolyte solutions reach the electroactive sites 

within the electrodes by flowing through a porous 

structure consisting of materials such as carbon felt. In 

contrast with FC storage systems, which require a 

specific device, i.e. the electrolyzer, for converting 

electrical energy into hydrogen and oxygen, RFBs can 

perform both conversions, from electrical to chemical 

and from chemical to electrical, in a single device. 

A second advantage of RFBs with respect to FCs is that 

their fuels are not hazardous gases such as hydrogen and 

oxygen, but much less dangerous electrolyte solutions, 

which makes handling and storage much simpler and 

cheaper. As shown in Fig. 2, only two tanks and two 

pumps are required for these functions. 

Moreover, RFBs operate by changing the metal ion 

valence, but the ions themselves are not consumed. This 

feature allows extremely long cyclic service with very 

low maintenance. 

The optimal electrolyte temperatures lie in a narrow range, roughly between 15°C and 35°C; outside this range unwanted side effects such as solution precipitation may occur. On the other hand, the temperature of the battery can be controlled rather easily by appropriately regulating the electrolyte flow. The control of RFBs is also relatively easy: the cell voltage allows the state of charge (SOC) of the battery to be monitored easily and, at the same time, very deep discharges are possible because such operations cause no damage to the morphology of the cell. Furthermore, self-discharge is prevented by the separation of the two electrolytes in two different circuits. The very fast reaction kinetics provide very fast response times, and high overloading is tolerable on short time scales.

On the other hand, at present even the most advanced 

RFBs have relatively low power and energy densities 

compared to other competing electrochemical storage 

technologies. Consequently, RFBs tend to have large 

active areas and ion conducting membranes and 

therefore their overall size is usually bulky, making them 

unsuitable for mobile applications. Also, the large active 

areas cause high transverse gradients of the solutions 

that feed the electrochemically active sites, particularly 

when operating at high power and with high flows. This causes the current density to be far from uniform over the active areas, with average values considerably lower than the maximum ones.

III. RFB TECHNOLOGIES 

A. Fe-Cr systems 

The first commercial RFBs were of the Fe-Cr type, featuring open circuit voltages of about 1 V for the single cell. Test systems in the range of 10-60 kW were produced in the late 1980s by several Japanese companies including Mitsui Engineering and Shipbuilding Co. Ltd, Kansai Electric Power Co. Inc, and Sumitomo Electric Industries Ltd (SEI). Besides relatively low energy density, the main drawbacks of such systems included slow reaction of the Cr ion, membrane aging, and cell degradation due to the mixing of the two ions. Because of these problems, Fe-Cr cells are inferior to VRBs and were abandoned after the emergence of the latter.

B. VRB systems 

VRBs (Vanadium Redox Batteries), also called all-vanadium RFBs, are currently the most successful RFB technology and the only one that has reached substantial commercial maturity. Such systems use only vanadium, dissolved in aqueous sulfuric acid (~5 M). A positive feature with respect to other RFBs is that, since they use the same metal at each electrode, the electrodes and membrane are not cross-contaminated, which prevents capacity decrease and provides longer lifetimes.

Exploiting the ability of vanadium to exist in solution in four different oxidation states, vanadium II-III (bivalent-trivalent) is used at one electrode while vanadium IV-V (tetravalent-pentavalent) is used at the other one. The electrochemical half-reactions are:

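$$
\begin{aligned}
\text{positive electrode:}\quad & \mathrm{VO^{2+} + H_2O} \;\underset{\text{discharge}}{\overset{\text{charge}}{\rightleftharpoons}}\; \mathrm{VO_2^{+} + 2H^{+} + e^{-}} \\
\text{negative electrode:}\quad & \mathrm{V^{3+} + e^{-}} \;\underset{\text{discharge}}{\overset{\text{charge}}{\rightleftharpoons}}\; \mathrm{V^{2+}}
\end{aligned}
\tag{1}
$$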

Fig. 4: Polarization curve of a RFB 

During charge, tetravalent vanadium ions VO²⁺ are oxidized to pentavalent vanadium ions VO₂⁺ at the positive electrode, while trivalent ions V³⁺ are reduced to bivalent ions V²⁺ at the negative electrode. The hydrogen ions 2H⁺ created at the positive electrode flow to the negative one through the membrane, thus maintaining the electrical neutrality of the electrolytes. The theoretical open circuit voltage (OCV) of a VRB cell is E_o = 1.26 V at 25°C, but real cells exhibit E_o = 1.4 V in operating conditions. The cell voltage v in load operation differs from E_o due to the activation overpotentials η of the electrodes, which are modeled by the Butler-Volmer equation:

$$
j = j_o\left[\frac{c_r(0,t)}{c_r^*}\exp\left(\frac{\alpha F}{RT}\,\eta\right) - \frac{c_p(0,t)}{c_p^*}\exp\left(-\frac{(1-\alpha)F}{RT}\,\eta\right)\right] \tag{2}
$$

where j is the current density at the electrode, j_o is the exchange current density, c_i are the species concentrations at the electrochemical activity sites of the reagents and products indicated in (1) (i = r reagents, i = p products), α is the transfer coefficient (with a value around 0.5), F is the Faraday constant, R is the gas constant, and T is the absolute temperature. The c_i/c_i* fractions express the dynamic reduction of the concentrations normalized to steady state equilibrium values.

According to (2), v = E_o − η is higher than E_o in the charge phase, i.e. where the current density is j < 0, and lower than E_o in the discharge phase, i.e. where j > 0 and electric power is released. j_o is a parameter which depends on the reactions and on the physical-chemical structure of the electrodes, and is crucial for the cell operation, since the higher j_o, the lower η for a given j. In fact, the activation overpotentials are the major culprits for the cell’s internal losses at lower current densities (with c_i/c_i* ≅ 1, Fig. 4). Thus, increasing j_o by means of appropriate electrode designs yields performance improvements and higher round trip efficiency. j_o can be increased with higher concentrations, lower activation barriers (i.e. higher activity provided by efficient catalysts), and larger effective active areas, achievable by means of highly porous electrodes (e.g. nanostructured materials).

Fig. 5: Schematic of a RFB stack with side fluid feedings – series of about 100 cells with active areas as large as 0.4x0.4 m² are usual
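The statement that a higher j_o lowers η for a given j can be checked numerically. The sketch below evaluates the Butler-Volmer expression with the concentration ratios set to 1 and inverts it by bisection. The function names, the bracketing interval, the temperature of 298.15 K and the two exchange current densities are hypothetical values chosen for illustration, not data from a specific cell.

```python
import math

F = 96485.33212   # Faraday constant, C/mol
R = 8.314462618   # gas constant, J/(mol K)

def bv_current(eta, j0, alpha=0.5, cr_ratio=1.0, cp_ratio=1.0, temp=298.15):
    """Current density for overpotential eta (V) and exchange current density j0."""
    f = F / (R * temp)
    return j0 * (cr_ratio * math.exp(alpha * f * eta)
                 - cp_ratio * math.exp(-(1.0 - alpha) * f * eta))

def overpotential(j, j0, lo=-1.0, hi=1.0, tol=1e-12):
    """Solve bv_current(eta, j0) = j for eta by bisection (j is monotone in eta)."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if bv_current(mid, j0) < j:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

j = 500.0  # A/m^2, hypothetical operating point
eta_low_j0 = overpotential(j, j0=10.0)    # poorer electrode (assumed value)
eta_high_j0 = overpotential(j, j0=100.0)  # better electrode (assumed value)
```

For α = 0.5 and unit concentration ratios the result can be compared with the closed form η = (2RT/F) asinh(j / (2 j_o)), which gives a smaller overpotential for the larger exchange current density.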

At medium current densities internal losses mainly 

depend on the ion conducting membrane that separates 

the electrodes (Fig. 3). The material of choice is a 

perfluorosulfonic acid polymer that allows, if properly 

hydrated, the transport of ions by binding cations to its 

sulfonic acid sites. It is a rather expensive material 

patented by DuPont and commercially available under 

the name Nafion. Electrically the membrane behaves as 

a linear resistor, if temperature and hydration are kept 

constant. 

At higher current densities the losses are dominated by 

transport phenomena in the electrode diffusion layers, 

which dramatically reduce the concentrations (ci/ci

system so far, intended for smoothing power output 

fluctuations at the Subaru Wind Villa Power Plant which 

is rated at 30.6 MW, is a 4 MW / 6 MWh installation 

built by Sumitomo Electric Industries (SEI), Japan, for J-Power in 2005. The system consists of 4 banks, each of

24 stacks and rated at 1 MW (which can be overloaded 

up to a maximum of 1.5 MW). Individual stacks consist 

of 108 cells, with a rated power of 45 kW each. During 

over 3 years of operation, the system completed more 

than 270,000 complete cycles, thus demonstrating its 

reliability. 
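The figures quoted for this installation can be cross-checked with simple arithmetic. The discharge time below assumes operation at the rated power of 4 MW with the full rated energy of 6 MWh available, which is an idealization made for this example:

```latex
\begin{aligned}
P_{\text{bank}} &= 24 \times 45\ \text{kW} = 1080\ \text{kW} \approx 1\ \text{MW} \\
N_{\text{cells}} &= 4 \times 24 \times 108 = 10368 \\
t_{\text{discharge}} &= \frac{6\ \text{MWh}}{4\ \text{MW}} = 1.5\ \text{h}
\end{aligned}
```

The stack power per bank is thus consistent with the 1 MW bank rating.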

The abovementioned SEI is one of the largest manufacturers of VRB systems for the smoothing and leveling of the fluctuating power generated by wind farms. Most such systems have been built by SEI and later by VRB Power Inc., based in Vancouver, Canada, which acquired SEI patents around 2005. In 2009, all vanadium redox battery assets of VRB Power Inc. were acquired by Prudent Energy, controlled by investors from China and the U.S.A., as part of a plan of business expansion in China and abroad. Further important efforts in the development of commercial RFB technologies in China are those of the Chengde Wanlitong Industrial Group. The reason for this interest lies in Chinese plans to expand the exploitation of intermittent renewable energy sources, especially wind. In fact, wind power production in the country is expected to rise from about 20 GW in 2010 to 100 GW in 2015, and almost $50 billion per year are expected to be invested in power grid improvements in the next decade to handle this increasing amount of energy production from intermittent sources.

Significant developments are also taking place in other 

Asian countries, e.g. Cellennium Company Ltd. of 

Thailand produces VRB systems under license, while 

Samsung Electronics Co. Ltd. in South Korea is engaged 

in the development of RFBs with nonaqueous 

electrolytes. 

Further interesting developments are taking place in 

Australia, where V-Fuel Pty Ltd is pursuing innovative 

V-Br technology in cooperation with the University of 

New South Wales (UNSW). Other Australian companies 

working on RFBs are ZBB Energy Corp. and Redflow 

Ltd., both involved in the development and installation 

of Zn/Br2 batteries. 

In the U.S., the Department of Energy (DoE) launched 

an RFB development program which identified Ashlawn 

Energy, LLC for the design of a 1 MW / 8 MWh VRB 

test plant while Primus Power Corp. was funded to 

develop a 25 MW / 75 MWh system based on Zn/Cl2 

RFBs. Premium Power Corp. is also developing Zn/Br2 

batteries. 

In Europe, Renewable Energy Dynamics (RED-T), 

Ireland, Cellstrom GmbH, Austria, and RE-Fuel 

Technology Ltd., UK, are some of the most active 

companies developing and producing VRB systems. 

High-energy density innovative RFBs are also being 

investigated in Germany, where the Fraunhofer- 

Gesellschaft is researching nonaqueous electrolytes, and 

in the UK where Plurion Ltd is working on Zn-Ce 

systems. 

Overall, since the market for smart grid technologies is 

expected to grow significantly worldwide in the near 


future, the market for VRB systems, which is already 

starting to flourish, is also expected to expand 

vigorously. 

V. RESEARCH ISSUES 

In spite of the previously described initial commercial 

successes, RFB technology has yet to witness a complete 

technical and commercial breakthrough and substantial 

R&D programs are still required in order to fully unleash 

its industrial potential. The next generation of systems, 

expected within the next 5 years, will be even more 

economically competitive and will be able to provide the 

capital and lifecycle cost reductions that are essential for 

widespread commercial success. 

The basis for more compact and efficient systems, 

exhibiting higher power and energy densities will be 

provided by non-aqueous electrolytic solutions and by 

improved electrode activity. Improved electrolytes will 

also allow the operating temperature range to be expanded. 

For example, the nonaqueous 2 MW / 20 MWh RFB 

system under development at the Fraunhofer Institute 

will consist of 8 blocks of 7 stacks, with 100-cell 

stacks, and will have an output of 2 kV, 1 kA, while 

being fed from 2 × 300 m³ tanks. Further improvements 

will come also from nanostructured electrodes, currently 

under development, with increased effective surface area 

and hence improved exchange current density. 
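
As a quick plausibility check, the ratings quoted above for the Fraunhofer system can be combined as in the following minimal sketch. Only the numbers given in the text are used; the discharge time is simply rated energy divided by rated power and ignores all losses, which is an assumption of this sketch.

```python
# Consistency check of the quoted ratings (all values taken from the text above).
voltage_v = 2_000.0       # 2 kV output
current_a = 1_000.0       # 1 kA output
energy_mwh = 20.0         # rated energy
blocks, stacks_per_block, cells_per_stack = 8, 7, 100

power_mw = voltage_v * current_a / 1e6        # electrical power P = U * I
discharge_hours = energy_mwh / power_mw       # ideal time at rated power (no losses)
total_stacks = blocks * stacks_per_block
total_cells = total_stacks * cells_per_stack

print(power_mw, discharge_hours, total_stacks, total_cells)
```

The product of voltage and current reproduces the quoted 2 MW rating, and the stack layout corresponds to 56 stacks in total.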

In next generation systems, the currently common and 

expensive Nafion ion conducting membrane will be 

substituted with alternative ones having significantly 

reduced cost and, at the same time, lower ohmic losses 

in the cells. Incidentally, further material cost reduction 

will also be provided by higher power densities, through 

more compact designs. 

Apart from the above mentioned developments which 

involve mainly materials science and basic chemistry, 

important engineering efforts are being aimed at system 

scale-up and at the structural and operational 

optimization of flow geometries, state-of-charge 

monitoring and supervisor systems. Numerical modeling 

and simulation are instrumental in improving the current 

systems which are far from optimal in many respects. 

Multi-scale, multidimensional, multi-physics models, both 

steady-state and dynamic, can accurately 

simulate the behavior of the whole system and its 

components and thus speed up the development of more 

efficient components and systems. Many modeling 

problems encountered in RFB systems are similar to 

those posed by direct alcohol fuel cells which also 

consist of the same basic building blocks (MEA-based 

cells, bipolar plates and stacks) and are also fed with 

liquid solutions instead of gases, so that some of the 

numerical tools developed in that context [10] may be 

adapted to the simulation of RFB systems. Sophisticated 

modeling tools are also aimed at designing advanced 

bipolar plates with either flow-by or flow-through 

diffusion of the electrolytic solutions, where the aim is to 

minimize transverse gradients and, at the same time, to 

reduce longitudinal conductance for lowering the shunt 

currents. Advanced computational techniques are needed 

to deal with the very challenging numerical problems 

arising from cell elements which exhibit multi-physics

material behavior and high aspect ratio geometries 

[11,12]. 

In the area of controls engineering, advanced control 

systems will provide automatic electrolyte rebalancing 

and capacity correction and will possibly allow the 

remote operation of large RFB systems. Optimized 

electrolyte flow-rates will also minimize pumping 

energy requirements, which are one of the main factors 

affecting the overall efficiency (together with shunt 

currents and internal cell losses). Such control systems 

will eventually cope with the conflicting requirements 

arising from the strong dependence of the cell voltage 

vs. current polarization curve on the solution flow-rates. 

As far as the electrical interface of RFB systems is 

concerned, modeling, simulation and optimization are 

aimed at designing supervisor and control sub-systems 

with proper feed-back loops and reduced response times 

which are required to assure improved performance for 

peak shaving, sag compensation and load leveling in the 

smart-grid context. Flexible solutions for interfacing 

both the DC renewable energy sources and AC grid and 

load can be obtained with DC/DC converters coupled to 

inverters. Non-linear control techniques of the inverter 

can allow RFB systems to provide active as well as 

reactive power to the smart-grid connected loads. The 

success in designing such power management subsystems, 

including both the DC/DC converter and the 

inverter, strongly depends on the accuracy in modeling 

the various components and the whole system. 

Further research is also needed for optimizing the 

technological solutions from the economical (operating 

earning and savings arising from the RFBs operation) 

and environmental (primary energy savings, carbon 

dioxide emission reductions) point of view. The results 

of these analyses will allow assessing the viability of 

RFB technologies within the context of modern energy 

hubs. 

All the above described scientific challenges raised by 

RFBs require strongly interdisciplinary development 

programs and collaborative efforts among researchers 

with different and complementary expertise. If such 

efforts are successful the next generation of RFB 

systems will be low-cost, highly efficient and durable, 

and thus be suitable for large-scale industrial 

exploitation, overcoming the limitations of more 

conventional systems. 

Finally, more compact and more flexible RFB systems, 

such as the ones mentioned above, may one day become 

suitable for powering some classes of electric vehicles. 


Redox flow batteries (RFBs) are already a promising 

energy storage technology and first generation systems, 

based on all-vanadium solutions, have already been 

successfully demonstrated in test installations 

worldwide, and their commercial exploitation is 

under way. The next generation of systems, with 

increased power and energy densities, is currently 

under development, but further progress in 

electrochemical materials and system engineering are 

expected to produce the final technical and commercial 

breakthrough. RFB systems are expected to become a 


key technology for stationary smart-grid-oriented 

applications supporting the load leveling and peak 

shaving of intermittent renewable energy sources. Future 

high-density systems may also become suitable for some 

automotive applications. 

REFERENCES 

[1] European Commission, “Proposal for a Council Decision 

establishing the Specific Programme Implementing Horizon 

2020 - The Framework Programme for Research and 

Innovation (2014-2020)”, COM(2011) 811 final, 2011/0402 

(CNS). 

[2] B. Dunn, H. Kamath, J.Tarascon, “Electrical Energy 

Storage for the Grid: A Battery of Choices”, Science, 334, pp. 

928-935, 2011. 

[3] Z. Weber, M. M. Mench, J. P. Meyers, P. N. Ross, J. T. 

Gostick, Q. Liu, “Redox flow batteries: a review”, J. Appl. 

Electrochem. 41, pp. 1137-1164, 2011. 

[4] T. Shigematsu, “Redox Flow Batteries for Energy Storage”, 

SEI Technical Review, 73, pp. 4-13, 2011. 

[5] C. Ponce de León, A. Frías-Ferrer, J. González-García, 

D.A. Szánto, F. C. Walsh, “Redox flow cells for energy 

conversions”, J. Power Sources, 160, pp. 716-732, 2006. 

[6] C. Menictas, M. Skyllas-Kazacos, “Performance of 

vanadium-oxygen redox fuel cell”, J. Appl. Electrochem., 41, 

pp. 1223-1232, 2011. 

[7] M. Skyllas-Kazacos, G. Kazacos, G. Poon, H. Verseema, 

“Recent advances with UNSW vanadium-based redox flow 

batteries”, Int. J. Energ. Res., 34, pp. 182-189, 2010. 

[8] H. Kaneko, A. Negishi, K. Nozaki, K. Sato, M. Nakajima, 

“Redox battery”, US Patent 5318865, 1992. 

[9] C. Menictas, M. Skyllas-Kazacos, “Vanadium-oxygen 

redox fuel cell”, Final report. SERDF Grant, NSW Department 

of Energy, 1997. 

[10] V. Di Noto, M. Guarnieri, F. Moro, “A Dynamic Circuit 

Model of a Small Direct Methanol Fuel Cell for Portable 

Electronic Devices”, IEEE Trans. Ind. Electronics, Vol. 57, No. 

6, pp. 1865-1873, 2010. 

[11] P. Alotto, M. Guarnieri, F. Moro, A. Stella: “A Proper 

Generalized Decomposition Approach for Fuel Cell Polymeric 

Membrane Modelling”, IEEE Trans. Mag., Vol. 47 No. 5, pp. 

1462-1465, 2011. 

[12] P. Alotto, M. Guarnieri, F. Moro, A. Stella: “Multi-physic 

3D dynamic modelling of polymer membranes with a proper 

generalized decomposition model reduction approach”, 

Electrochimica Acta, pp. 250-256, 2011. 



Model Order Reduction via Proper Orthogonal 

Decomposition for a Lithium-Ion Cell 

B. Suhr*, J. Rubeša*

*Kompetenzzentrum - Das Virtuelle Fahrzeug Forschungsgesellschaft mbH (VIF), Graz, Austria 

E-mail: bettina.suhr@v2c2.at 

Abstract—The simulation of lithium-ion batteries is a challenging research topic. Since there are many electrochemical 

processes involved in dis-/charging, models which aim to include these processes are in general complex and therefore slow. 

For many tasks, e.g. in optimization, a repeated solution of a model is necessary. In this paper, a speed-up of the simulations, 

with an acceptable error in the results, is obtained by combining proper orthogonal decomposition with the empirical interpolation 

method. We report a speed-up factor between 10 and 15. 

Index Terms—electrochemical model, empirical interpolation method, model reduction, proper orthogonal decomposition 

I. INTRODUCTION 


The accurate and fast simulation of lithium-ion batteries
is of growing interest in the automotive industry.
As fossil fuels are limited, more and more research is
conducted on electric and especially hybrid cars. Here, the
quality and the speed of the battery simulation are crucial.
Battery models are often strongly simplified in their
physics to meet the speed requirements of on-board
usage or optimization. In contrast, here a
speed-up in simulation is gained by using model
reduction via proper orthogonal decomposition (POD)
combined with a fast evaluation of nonlinearities, the
empirical interpolation method (EIM). 

Cai and White [4] applied the POD method to a battery
model, but the major nonlinearity of the system was
assumed to be constant. Starting from the full model with a
very fine spatial discretization, a speed-up factor of 4
was obtained. Their work compared the full and the
reduced model for constant discharge simulations only. 

We follow the work of Lass and Volkwein [7],
where POD and EIM were applied to the battery
model of Wu-Xu-Zou [10]. We apply this procedure
to the more general but more complex battery model
of Cifrain [5]. In our work, as in the work of Lass,
none of the previously mentioned simplifications of the
battery model are necessary, and a speed-up factor of 15
was obtained for constant discharge simulations. 

The paper is organized as follows. In
Section II the nonlinear parabolic dynamical system that
describes the considered battery model is formulated. Section
III is devoted to the reduced order model (ROM) based on
the proper orthogonal decomposition (POD) method.
We describe the POD method in general and its
application to the battery system. Moreover, the empirical
interpolation method is introduced. In Section IV numerical
results are presented. Finally, in Section V conclusions
are drawn and an outlook on future work is given. 

II. BATTERY MODEL 

The battery cell consists of two electrodes, an anode 

and a cathode, and a separator between them. Each 

electrode consists of particles and an electrolyte, while 

in the separator we consider only the electrolyte. 

The mathematical model described in [5] and used
here is an electrochemical model similar to the well-known
model of Newman [6]. It is a coupled dynamical
system of four nonlinear partial differential equations.
The system variables are potentials and concentrations
for the electrolyte, $\phi_l, c_l$, for the cathode, $\phi_{sc}, c_{sc}$, and
for the anode, $\phi_{sa}, c_{sa}$. The three potentials and the
liquid concentration are one-dimensional system variables:
these four variables depend on
time $t \in [0,T]$, $T \in \mathbb{R}$, and space $x \in \Omega$,
where $\Omega \subset \mathbb{R}$. The two remaining variables, the
solid concentrations, are two-dimensional,
i.e., $c_s := c_s(x, r, t)$ with $(x, r, t) \in \Lambda \times (0,T)$, where $\Lambda \subset \mathbb{R}^2$. Such a
model is also referred to as a pseudo-two-dimensional
model; see Figure 1. The considered battery model is given
as follows: 

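$$
\begin{aligned}
C_{DS} A_i \frac{\partial(\phi_s - \phi_l)}{\partial t} - \nabla\cdot(\sigma_s \nabla\phi_s) &= -A_i \theta\, j_{BV} && \text{in } \Omega' && (1a)\\
-C_{DS} A_i \frac{\partial(\phi_s - \phi_l)}{\partial t} - \nabla\cdot(\kappa_l(c_l)\nabla\phi_l) - \nabla\cdot\Bigl(\frac{RT}{zF}\,\kappa_l(c_l)\, t_l^{+}\,\frac{1}{c_l}\nabla c_l\Bigr) &= A_i \theta\, j_{BV} && \text{in } \Omega && (1b)\\
\frac{\partial(\varepsilon_l c_l)}{\partial t} - C_{DS} A_i \frac{\partial(\phi_s - \phi_l)}{\partial t} - \nabla\cdot\Bigl(D_l \nabla c_l + \frac{zF}{RT}\,\mu_l c_l \nabla\phi_l\Bigr) &= \frac{A_i \theta}{F}\, j_{BV} && \text{in } \Omega && (1c)\\
\frac{\partial c_s}{\partial t} &= \frac{1}{r^2}\frac{\partial}{\partial r}\Bigl(D_s r^2 f_{RK}(c_s)\frac{\partial c_s}{\partial r}\Bigr) && \text{in } \Lambda && (1d)
\end{aligned}
$$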

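strongly coupled with

$$
j_{BV} =
\begin{cases}
zFk\theta \exp\Bigl(\dfrac{\alpha_A zF(\phi_s - \phi_l - U_{OCV}(c_s))}{RT}\Bigr)
 - zFk\theta \exp\Bigl(\dfrac{-(1-\alpha_K) zF(\phi_s - \phi_l - U_{OCV}(c_s))}{RT}\Bigr) & \text{in } \Omega' \\[1ex]
0 & \text{in } \Omega_s
\end{cases}
$$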

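where the spatial sub-domains are introduced
as $\Omega = \Omega_c \cup \Omega_s \cup \Omega_a$, $\Omega' = \Omega_c \cup \Omega_a$,
$\Lambda_a = \Omega_a \times [0,R_a] \subset \mathbb{R}^2$, $\Lambda_c = \Omega_c \times [0,R_c] \subset \mathbb{R}^2$,
$\Lambda = \Lambda_a \cup \Lambda_c$ and $R_a, R_c \in \mathbb{R}$. See Section VI for a 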

list of symbols. The system is initialized with constant 

values which correspond to an equilibrium state, i.e., 

jBV =0. The boundary conditions are all homogeneous 

Neumann conditions at the external boundaries and 

continuous flux conditions at internal boundaries 

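(between electrodes and separator). Exceptions are
the solid potential, where either a current is specified,
$-\sigma_s \frac{\partial \phi_s}{\partial x}\big|_{\Gamma_a} = -\frac{I(t)}{A_{cell}}$,
or a voltage, $\phi_s\big|_{\Gamma_a} = U(t)$. Also
the solid concentration has a nonzero boundary condition:

$$
D_s f_{RK}(c_s)\frac{\partial c_s}{\partial r}\Big|_{r=R_p} = -\frac{j^{*}_{BV}}{F} - \frac{1}{F}\, C_{DS}\frac{\partial(\phi_s - \phi_l)}{\partial t}.
$$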

The nonlinear initial value problem (1) is discretized
in space with the finite element (FE) method and in time
with the implicit Euler method, and is linearized with the
damped Newton method. Details on the numerics, initial
values and boundary conditions can be found in [8]. Once
the numerical solution of the full model was available, the need
for model reduction and speed-up arose. 
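
The coupling term $j_{BV}$ is the dominant pointwise nonlinearity of system (1). A minimal sketch of its evaluation in the electrodes is given below. The function name, the default transfer coefficients of 0.5, the temperature of 298.15 K and the sample value of k are assumptions made for illustration only; the constants F and R are those of the list of symbols.

```python
import numpy as np

F = 96485.0   # Faraday's constant in C/mol (value from the list of symbols)
R = 8.31447   # universal gas constant in J/(mol K)

def butler_volmer(phi_s, phi_l, u_ocv, k, theta=1.0,
                  alpha_a=0.5, alpha_k=0.5, z=1, temp=298.15):
    """Butler-Volmer current density in the electrodes (zero in the separator).

    Default parameter values are illustrative assumptions, not model data.
    """
    eta = phi_s - phi_l - u_ocv            # overpotential phi_s - phi_l - U_OCV
    a = z * F / (R * temp)
    return z * F * k * theta * (np.exp(alpha_a * a * eta)
                                - np.exp(-(1.0 - alpha_k) * a * eta))

# Evaluate on a small grid of overpotentials (phi_l = U_OCV = 0 for simplicity).
eta_grid = np.linspace(-0.1, 0.1, 5)
j = butler_volmer(eta_grid, 0.0, 0.0, k=1e-6)
print(j)
```

At equilibrium, i.e. for $\phi_s - \phi_l = U_{OCV}(c_s)$, both exponentials equal one and the current density vanishes, which matches the initialization with $j_{BV} = 0$.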

 

 

 

Fig. 1. Pseudo-two-dimensional model. 

III. MODEL REDUCTION 

The reduction of linear dynamical systems is a classical 

research topic and there exist several well known 

methods for this task, e.g. balanced truncation, Krylov 

methods, reduced basis and POD. Detailed information 

can be found in the standard literature or in [1], [3] 

and [9]. A relatively new area in research is the model 

reduction of nonlinear dynamical systems of equations. 

A. The POD method. 

POD is a commonly used model order reduction technique, 

when repeated simulations are to be conducted. 

Relevant information is extracted from snapshots generated 

with the full model and saved into the POD basis. 

Using the POD basis, a ROM is built, and the ROM is used 

for all further simulations. 

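Computation of the POD basis: We refer to the solutions $y^j \in \mathbb{R}^m$, $j = 1,\dots,n$, of the full system as
snapshots and define

$$ Y := [y^1, \dots, y^n] \in \mathbb{R}^{m \times n}. $$

The goal of POD is to find $l \le d = \dim \operatorname{span}(Y) \le n$
orthonormal vectors $\{\Psi_i\}_{i=1}^{l}$ in $\mathbb{R}^m$ that give the best
approximation of $Y$, i.e., that minimize the cost function

$$ J(\Psi_1,\dots,\Psi_l) = \sum_{j=1}^{n} \alpha_j \Bigl\| y^j - \sum_{i=1}^{l} \langle y^j, \Psi_i\rangle_X \Psi_i \Bigr\|^2, \qquad (2) $$

where $\|x\| = \sqrt{x^T x}$ is the Euclidean norm, $\alpha_j \ge 0$ are
the weights and $\langle x, y\rangle_{L^2}$ is the inner product. Further,
from the Lagrange functional

$$ \mathcal{L}(\Psi_1,\dots,\Psi_l,\lambda_{11},\dots,\lambda_{ll}) = J(\Psi_1,\dots,\Psi_l) + \sum_{i,j=1}^{l} \lambda_{ij}\bigl(\langle \Psi_i, \Psi_j\rangle_X - \delta_{ij}\bigr), $$

with the Kronecker symbol $\delta_{ij}$, we obtain two necessary
optimality conditions:

1) $\sum_{j=1}^{n} \alpha_j y^j \langle y^j, \Psi_i\rangle_X = \lambda_{ii}\Psi_i$, $\lambda_{ij} = 0$ for $i \ne j$,

2) $\langle \Psi_i, \Psi_j\rangle_X = \delta_{ij}$, $i,j = 1,\dots,l$.

The second condition is satisfied under the assumption
that the vectors $\{\Psi_i\}_{i=1}^{l}$ are orthonormal, and the first
condition is equivalent to

$$ Y B Y^T M \Psi_i = \lambda_i \Psi_i, \quad i = 1,\dots,l, \qquad (3) $$

where $M \in \mathbb{R}^{m \times m}$ is the mass matrix from the FE
simulation, $\lambda_i = \lambda_{ii}$, $B := \operatorname{diag}(\alpha_1,\dots,\alpha_n)$, $\alpha_1 = \frac{\delta t_1}{2}$,
$\alpha_j = \frac{\delta t_j + \delta t_{j-1}}{2}$, $j = 2,\dots,n-1$, and $\alpha_n = \frac{\delta t_n}{2}$. There
are several ways to solve (3); we consider
the following approach. 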
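
The weights in $B$ follow directly from the time step sizes. A minimal sketch, assuming the step sizes $\delta t_1,\dots,\delta t_n$ are available as an array with $n \ge 2$ (the function name is hypothetical):

```python
import numpy as np

def trapezoidal_weights(dt):
    """Weights alpha_1..alpha_n from the step sizes dt_1..dt_n, as defined for B."""
    dt = np.asarray(dt, dtype=float)
    alpha = np.empty_like(dt)
    alpha[0] = dt[0] / 2.0                        # alpha_1 = dt_1 / 2
    alpha[1:-1] = (dt[1:-1] + dt[:-2]) / 2.0      # alpha_j = (dt_j + dt_{j-1}) / 2
    alpha[-1] = dt[-1] / 2.0                      # alpha_n = dt_n / 2
    return alpha

alpha = trapezoidal_weights([1.0, 2.0, 3.0, 4.0])
B = np.diag(alpha)
print(alpha)
```

With adaptive time stepping the weights differ from snapshot to snapshot, so snapshots taken with large steps contribute more to the cost function (2).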

a) K-Ansatz (Covariance matrix): By setting $\bar{Y} := M^{1/2} Y B^{1/2}$
we can define the symmetric matrix $K$, i.e.,

$$ K := \bar{Y}^T \bar{Y} = B^{1/2} Y^T M Y B^{1/2} \in \mathbb{R}^{n \times n}. $$

Applying the eigenvalue decomposition to
$K$ we obtain the following relation:

$$ K v_j = \lambda_j v_j, \quad j = 1,\dots,n. $$

We denote the $n$ eigenvalues of the matrix $K$ by $\lambda_j$
and the eigenvectors by $v_j$. We sort the eigenvalues in
decreasing order, i.e., $\lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_l \ge \dots \ge \lambda_n$,
and keep the first $l$ eigenvalues such that the following
holds:

$$ \sum_{j=l+1}^{n} \lambda_j \le \text{tolerance}, $$

since the value of the cost function (2) is given
by $J(\Psi_1,\dots,\Psi_l) = \sum_{j=l+1}^{n} \lambda_j$. Hence, the choice of
$l$ depends strictly on the eigenvalues $\lambda_j$, $j = 1,\dots,n$,
and on how fast they decay. Figure 2 shows the computed
eigenvalues, scaled by the trace of the matrix
$K$. 
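
The truncation rule can be sketched as follows. The eigenvalues are scaled by their sum, i.e. by the trace of $K$, before the tail sum is compared with the tolerance; the function name and the sample eigenvalues are hypothetical.

```python
import numpy as np

def choose_rank(eigvals, tol):
    """Smallest l such that the scaled tail sum_{j>l} lambda_j is <= tol."""
    lam = np.sort(np.asarray(eigvals, dtype=float))[::-1]   # decreasing order
    lam = lam / lam.sum()                                   # scale by trace(K)
    # tail[i] = sum of lam[i+1:], i.e. the discarded energy when keeping i+1 modes
    tail = np.concatenate((np.cumsum(lam[::-1])[::-1][1:], [0.0]))
    return int(np.nonzero(tail <= tol)[0][0]) + 1

l = choose_rank([1.0, 1e-2, 1e-5, 1e-10], tol=1e-8)
print(l)
```

Since the tail sum for $l = n$ is zero, the search always terminates with a valid rank.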

The POD basis consists of $l$ functions:

$$ \Psi_i(x) = \frac{1}{\sqrt{\lambda_i}} \sum_{j=1}^{n} \alpha_j^{1/2}\, (\bar{v}_i)_j\, y^j(x) \in \mathbb{R}^m, \quad i = 1,\dots,l. $$

Fig. 2. Example plot of decaying eigenvalues. 

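Denoting the FE basis functions by $\varphi_k$, the snapshots
$y^j$ have the standard Galerkin representation
$y^j = \sum_{k=1}^{m} y^j_k \varphi_k(x)$. This directly yields:

$$ \Psi_i(x) = \sum_{k=1}^{m} \psi^i_k \varphi_k(x), \qquad \psi^i_k = \frac{1}{\sqrt{\lambda_i}} \sum_{j=1}^{n} \alpha_j^{1/2}\, (\bar{v}_i)_j\, y^j_k. $$

For later use we define $(\Psi_i)_k := \psi^i_k$, $\Psi_i \in \mathbb{R}^m$, $i = 1,\dots,l$,
and the matrix $\hat{\Psi} := [\Psi_1, \dots, \Psi_l] \in \mathbb{R}^{m \times l}$. 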
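
The K-Ansatz can be sketched in a few lines of NumPy. The snapshot matrix, mass matrix and weights below are random placeholders chosen only to make the sketch runnable; the function name is hypothetical and the columns of the returned matrix hold the coefficient vectors $\Psi_i$.

```python
import numpy as np

def pod_basis(Y, M, alpha, l):
    """POD basis via K = B^{1/2} Y^T M Y B^{1/2} (method of snapshots)."""
    sqrt_b = np.sqrt(alpha)
    K = sqrt_b[:, None] * (Y.T @ M @ Y) * sqrt_b[None, :]
    lam, V = np.linalg.eigh(K)                 # symmetric eigenproblem, ascending
    order = np.argsort(lam)[::-1][:l]          # keep the l largest eigenvalues
    lam, V = lam[order], V[:, order]
    # Psi_i = (1 / sqrt(lambda_i)) * sum_j alpha_j^{1/2} (v_i)_j y^j
    Psi = (Y * sqrt_b[None, :]) @ V / np.sqrt(lam)
    return Psi, lam

# Placeholder data (assumptions): 6 degrees of freedom, 4 snapshots.
rng = np.random.default_rng(0)
m, n, l = 6, 4, 2
Y = rng.standard_normal((m, n))
M = np.diag(rng.uniform(0.5, 1.5, m))          # stand-in for the FE mass matrix
alpha = np.array([0.05, 0.1, 0.1, 0.05])
Psi, lam = pod_basis(Y, M, alpha, l)

print(np.round(Psi.T @ M @ Psi, 12))           # close to the identity matrix
```

The eigenproblem has the size of the number of snapshots $n$ rather than the number of degrees of freedom $m$, which is the practical advantage of this approach when $n \ll m$.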

B. Application of POD to the battery model. 

The POD basis for the battery model is calculated
for each variable separately. The snapshots of the full
FE solutions are split up into the data for each of the
six variables $\phi_l, c_l, \phi_{sc}, \phi_{sa}, c_{sc}, c_{sa}$, and the six matrices
of snapshots, $Y_i \in \mathbb{R}^{N_i \times n}$, $i = 1,\dots,6$, are obtained.
Here $n$ is the number of time steps and $N_i$ is the
number of degrees of freedom in the FE Ansatz for the
corresponding variable, i.e., $m = \sum_{i=1}^{6} N_i$. With these
prerequisites the computation of the POD basis for each
of the six variables can be carried out as described above.
The differences in the dimension of the domains of the
six variables cause no further changes. 

Before we describe how the reduced order model can
be built, we give a quick overview of how the
full FE system is solved and introduce some notation.
As the FE method is well known, we do not give details
on how the battery system (1) is transformed into
its weak form and how the Galerkin method is applied to it. 

The system, which is discretized in space, is still 

continuous in time and has the following form: 

$$ M \dot{u} + K(u)\,u = f(u), \qquad (4) $$

where $u := [\phi_l, c_l, \phi_{sc}, \phi_{sa}, c_{sc}, c_{sa}]^T$, $M$ is the mass 

matrix, K(u) is the stiffness matrix and f(u) is the right 

hand side of the system. Please note that we solve all 

six equations simultaneously, therefore all matrices are 

block matrices. For time discretization the implicit Euler 

method is applied to (4) and we use

$$ \Bigl(\frac{1}{\Delta t} M + K(u_{k+1})\Bigr) u_{k+1} - f(u_{k+1}) = \frac{1}{\Delta t} M u_k, \qquad (5) $$


in every time step $k = 1, 2, \dots, n$. As this system is
fully discretized but still nonlinear, we now apply the
Newton method in order to linearize the system (5). After
introducing the notation

$$ A(u_{k+1}) := \frac{1}{\Delta t} M + K(u_{k+1}) \quad \text{and} \quad F := \frac{1}{\Delta t} M u_k, $$

we can rewrite equation (5) and define the operator $G$ as

$$ G(u_{k+1}) := A(u_{k+1})\,u_{k+1} - f(u_{k+1}) - F = 0. \qquad (6) $$

Applying the Newton method, we end up calculating the
Newton step

$$ \Delta u = -\bigl[J_G(u^i_{k+1})\bigr]^{-1} G(u^i_{k+1}), \qquad (7) $$

in every time step $k$ until the predefined stopping criterion
is met for the sequence $\{u^i_{k+1}\}_{i=0}^{\infty}$, $u^{i+1}_{k+1} = u^i_{k+1} + \Delta u$.
The derivative $J_G$ is calculated using the derivatives of
$K$ and $f$ such that

$$ J_G(u_{k+1}) = \frac{1}{\Delta t} M + J_K(u_{k+1}) - J_f(u_{k+1}). \qquad (8) $$
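
The time stepping (5) with the Newton iteration (6)-(8) can be illustrated on a scalar toy problem. The sketch below uses an undamped Newton method and a hypothetical test equation $\dot{u} + (1 + u^2)u = 0$; here `JK` denotes the derivative of the product $K(u)u$ with respect to $u$. None of this is the battery model itself.

```python
import numpy as np

def implicit_euler_newton(M, K, JK, f, Jf, u0, dt, n_steps, tol=1e-10, max_iter=50):
    """Implicit Euler for M u' + K(u) u = f(u), each step solved by Newton."""
    u = np.array(u0, dtype=float)
    history = [u.copy()]
    for _ in range(n_steps):
        F_old = M @ u / dt                       # F := (1/dt) M u_k
        v = u.copy()                             # initial Newton iterate
        for _ in range(max_iter):
            G = (M / dt + K(v)) @ v - f(v) - F_old        # residual, eq. (6)
            if np.linalg.norm(G) < tol:
                break
            J = M / dt + JK(v) - Jf(v)                    # Jacobian, eq. (8)
            v = v + np.linalg.solve(J, -G)                # Newton step, eq. (7)
        u = v
        history.append(u.copy())
    return np.array(history)

# Toy problem (assumption): u' + (1 + u^2) u = 0, u(0) = 1.
M = np.array([[1.0]])
K = lambda u: np.array([[1.0 + u[0] ** 2]])
JK = lambda u: np.array([[1.0 + 3.0 * u[0] ** 2]])   # d/du of K(u) u = u + u^3
f = lambda u: np.zeros(1)
Jf = lambda u: np.zeros((1, 1))

hist = implicit_euler_newton(M, K, JK, f, Jf, u0=[1.0], dt=0.1, n_steps=10)
print(hist[:, 0])
```

In the full model the same loop runs on block matrices of dimension $m$, which is why the assembly of $K$ and $J_K$ in every Newton step dominates the cost.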

1) Reduced order model for the battery: The next step
is to build the reduced order model using the POD
basis. We point out that it is necessary to formulate
non-homogeneous Dirichlet boundary conditions with a
penalty method. 

To obtain the reduced order model, the Galerkin method
is used with the POD basis functions instead of the FE
basis functions used before. We denote the calculated POD
basis for each of the six variables by

$$ \hat{\Psi}_i \in \mathbb{R}^{N_i \times l_i}, \quad i = 1,\dots,6, \qquad l_{POD} = \sum_{i=1}^{6} l_i. $$

In the case of a single equation, the mass matrix $M$ of
the full system and the mass matrix $\hat{M}$ of the reduced
system are connected via the POD basis $\hat{\Psi}$ as follows:

$$ \hat{M} = \hat{\Psi}^T M \hat{\Psi}, \quad M \in \mathbb{R}^{m \times m}, \quad \hat{M} \in \mathbb{R}^{l \times l}. \qquad (9) $$

This can be adapted to a system of equations using the
block structure of the mass matrix. The mass matrix
$M \in \mathbb{R}^{m \times m}$ in (4) is a block matrix with the following
structure: $M_{ij} \in \mathbb{R}^{N_i \times N_j}$, $i,j = 1,\dots,6$. The corresponding
matrix $\hat{M} \in \mathbb{R}^{l_{POD} \times l_{POD}}$ in the reduced order
model can be obtained as follows:

$$ \hat{M}_{ij} = \hat{\Psi}_i^T M_{ij} \hat{\Psi}_j, \quad i,j = 1,\dots,6, $$

where $\hat{M}_{ij} \in \mathbb{R}^{l_i \times l_j}$ for $i,j = 1,\dots,6$. For the right
hand side in (5) it holds analogously:

$$ \hat{F}_i = \hat{\Psi}_i^T F_i \in \mathbb{R}^{l_i}, \quad i = 1,\dots,6, $$

where $F_i \in \mathbb{R}^{N_i}$, $F = [F_1, \dots, F_6]^T \in \mathbb{R}^m$ and
$\hat{F} = [\hat{F}_1, \dots, \hat{F}_6]^T \in \mathbb{R}^{l_{POD}}$. To improve readability we use
from now on $L(M) := \hat{M}$ and $L(F) := \hat{F}$. 
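
The block-wise projection can be sketched as follows. For brevity the example uses two variables instead of six, and all matrices are random placeholders; the function name is hypothetical.

```python
import numpy as np

def reduce_blocks(M_blocks, F_blocks, Psi):
    """Block-wise Galerkin projection: M_hat_ij = Psi_i^T M_ij Psi_j, F_hat_i = Psi_i^T F_i."""
    n = len(Psi)
    M_hat = np.block([[Psi[i].T @ M_blocks[i][j] @ Psi[j] for j in range(n)]
                      for i in range(n)])
    F_hat = np.concatenate([Psi[i].T @ F_blocks[i] for i in range(n)])
    return M_hat, F_hat

# Placeholder data (assumptions): two variables with N = (4, 3) DOFs and l = (2, 1) modes.
rng = np.random.default_rng(0)
N, l = [4, 3], [2, 1]
M_blocks = [[rng.standard_normal((N[i], N[j])) for j in range(2)] for i in range(2)]
F_blocks = [rng.standard_normal(N[i]) for i in range(2)]
Psi = [rng.standard_normal((N[i], l[i])) for i in range(2)]

M_hat, F_hat = reduce_blocks(M_blocks, F_blocks, Psi)
print(M_hat.shape, F_hat.shape)
```

The result is identical to projecting the assembled block matrix with a block-diagonal basis matrix, but the block-wise form avoids building that large, mostly zero matrix.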

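There are many hidden nonlinearities in equation (7)
which have to be evaluated for each time and Newton
step in the full FE dimension $m$. The full-dimensional state
needed for this is accessed via the relation $(u_{k+1})_i = \hat{\Psi}_i (\hat{u}_{k+1})_i$.

The equations corresponding to (6) and (8) for the
reduced order model are

$$ L(G(\hat{u}_{k+1})) := L(A(\hat{\Psi}\hat{u}_{k+1}))\,\hat{u}_{k+1} - L(f(\hat{\Psi}\hat{u}_{k+1})) - L(F), \qquad (10a) $$

$$ L(J_G(\hat{u}_{k+1})) := \frac{1}{\Delta t} L(M) + L(J_K(\hat{\Psi}\hat{u}_{k+1})) - L(J_f(\hat{\Psi}\hat{u}_{k+1})), \qquad (10b) $$

and the Newton step and the iterations are then defined by

$$ \Delta\hat{u} = -\bigl[L(J_G(\hat{u}_{k+1}))\bigr]^{-1} L(G(\hat{u}_{k+1})), \qquad (11a) $$

$$ \hat{u}_{k+1} = L u_k + \Delta\hat{u}. \qquad (11b) $$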

C. Application of EIM 

The evaluation of the nonlinearities in (10) is very slow.
EIM can be used to reduce the number of necessary evaluations
of these nonlinearities; see [2] for details on
the method. Let $g$ be a given nonlinear function which
can be evaluated pointwise. The method requires snapshots of
this function with fast decaying eigenvalues. For the
basis construction a greedy algorithm based on the biggest
residual is used, which stops at a given tolerance. The
returned basis, $\Upsilon = [\upsilon_1, \dots, \upsilon_{l_{EIM}}]$, has $l_{EIM}$ entries,
and the indices for the evaluation of the nonlinearity,
$p = [p_1, \dots, p_{l_{EIM}}]$, are also given. The nonlinear function $g$
is approximated as follows:

$$ g(y) = \Upsilon c(y) = \sum_{i=1}^{l_{EIM}} \upsilon_i c_i(y). \qquad (12) $$

To compute the coefficients $c$, the above equality can be
transformed to

$$ P^T \Upsilon c(y) = P^T g(y) \iff c(y) = (P^T \Upsilon)^{-1} P^T g(y) \qquad (13) $$

with the matrix $P = [e_{p_1}, \dots, e_{p_{l_{EIM}}}]$, where $e_{p_i} \in \mathbb{R}^m$
is the $p_i$-th unit vector. 
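
A minimal sketch of a greedy EIM basis construction and of the coefficient computation (13) is given below. It is a generic variant of the method and not necessarily the exact algorithm of [2]; the test function family, the grid and all names are assumptions chosen so that the example is exactly representable by three basis functions.

```python
import numpy as np

def eim(G, tol, max_basis=None):
    """Greedy EIM on a snapshot matrix G (columns = snapshots of g)."""
    m, n = G.shape
    max_basis = max_basis or min(m, n)
    basis, idx = [], []
    R = G.astype(float).copy()                 # residuals of all snapshots
    while len(basis) < max_basis:
        j = int(np.argmax(np.max(np.abs(R), axis=0)))   # snapshot with biggest residual
        r = R[:, j]
        p = int(np.argmax(np.abs(r)))                    # point with biggest residual
        if abs(r[p]) <= tol:
            break
        basis.append(r / r[p])
        idx.append(p)
        U = np.column_stack(basis)
        C = np.linalg.solve(U[idx, :], G[idx, :])        # interpolate all snapshots
        R = G - U @ C
    return np.column_stack(basis), np.array(idx)

def eim_coefficients(Upsilon, idx, g_at_points):
    """c = (P^T Upsilon)^{-1} P^T g, using only g at the selected indices."""
    return np.linalg.solve(Upsilon[idx, :], g_at_points)

# Hypothetical parameter-dependent function on a grid of 50 points.
x = np.linspace(0.0, 1.0, 50)
g = lambda mu: np.sin(x) + mu * x ** 2 + mu ** 2 * np.cos(3.0 * x)
G = np.column_stack([g(mu) for mu in np.linspace(0.5, 2.0, 10)])

Upsilon, p = eim(G, tol=1e-10)
g_new = g(1.23)                                # parameter not in the snapshot set
c = eim_coefficients(Upsilon, p, g_new[p])
err = np.max(np.abs(Upsilon @ c - g_new))
print(Upsilon.shape[1], err)
```

Only the entries of $g$ at the selected indices are needed online, which is what reduces the number of nonlinear function evaluations from $m$ to $l_{EIM}$.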

Since the singular values of the nonlinearity on the
right-hand side of the system decay slowly,
EIM is not useful there. Fortunately, the eigenvalues on the
left-hand side of the system decay fast, so EIM can be
applied. It then replaces the matrix assembly. 

In equation (10) the nonlinearities are part of the
stiffness matrix $K$ and its derivative $J_K$. As both cases
are handled analogously, we explain the case of the
stiffness matrix only. The nonlinearities in the different
blocks of the stiffness matrix are dealt with separately, and
therefore we drop one pair of indices and denote the
entries of an arbitrary block of the stiffness matrix as

$$ K_{ij} = \int_{\Omega} g(y)\, \nabla\varphi_i \nabla\varphi_j \, dx, \quad i,j = 1,\dots,m, \qquad (14) $$

where $g(y)$ is the nonlinear function to be approximated. 

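When we now insert equation (12) in (14) and combine
this with the reduced order stiffness matrix $\hat{K} = \hat{\Psi}^T K \hat{\Psi}$
we obtain:

$$ \hat{K} = \hat{\Psi}^T \Bigl[ \sum_{k=1}^{l_{EIM}} c_k(y) \int_{\Omega} \upsilon_k \nabla\varphi_i \nabla\varphi_j \, dx \Bigr]_{ij} \hat{\Psi}
 = \sum_{k=1}^{l_{EIM}} c_k(y)\, \underline{\hat{\Psi}^T \Bigl[ \int_{\Omega} \upsilon_k \nabla\varphi_i \nabla\varphi_j \, dx \Bigr]_{ij} \hat{\Psi}}. \qquad (15) $$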

The underlined expressions can be precomputed; these 

are $l_{EIM}$ matrices of dimension $l_{POD} \times l_{POD}$. Instead of 

the assembly of the stiffness matrix the coefficients c have 

to be calculated using (13), this includes lEIM function 

evaluations. These coefficients are then used in the above 

formula where a linear combination of the precomputed 

matrices is calculated. When lEIM is small a massive 

speed up can be gained by this method. 
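
The offline/online splitting in (15) can be sketched for a one-dimensional stiffness matrix with linear elements. The element-wise constant coefficient, the random POD and EIM bases and all names are assumptions of this sketch; it only demonstrates that the linear combination of precomputed matrices reproduces the projected stiffness matrix when the coefficient lies in the span of the EIM basis.

```python
import numpy as np

def assemble_stiffness(coeff, h):
    """1D linear-FE stiffness matrix for an element-wise constant coefficient."""
    n_el = len(coeff)
    K = np.zeros((n_el + 1, n_el + 1))
    for e, a in enumerate(coeff):
        K[e:e + 2, e:e + 2] += a / h * np.array([[1.0, -1.0], [-1.0, 1.0]])
    return K

# Placeholder bases (assumptions): 20 elements, 4 POD modes, 3 EIM basis functions.
rng = np.random.default_rng(1)
n_el, l_pod, l_eim = 20, 4, 3
h = 1.0 / n_el
Psi = np.linalg.qr(rng.standard_normal((n_el + 1, l_pod)))[0]
Upsilon = rng.random((n_el, l_eim))

# Offline: one reduced matrix per EIM basis function (the underlined terms in (15)).
K_hat_k = [Psi.T @ assemble_stiffness(Upsilon[:, k], h) @ Psi for k in range(l_eim)]

def reduced_stiffness(c):
    """Online: linear combination of the precomputed l_POD x l_POD matrices."""
    return sum(ck * Kk for ck, Kk in zip(c, K_hat_k))

c = np.array([0.7, -0.2, 1.5])
K_online = reduced_stiffness(c)
K_direct = Psi.T @ assemble_stiffness(Upsilon @ c, h) @ Psi   # full assembly + projection
print(np.max(np.abs(K_online - K_direct)))
```

The online cost depends only on $l_{EIM}$ and $l_{POD}$ and no longer on the number of finite elements.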

IV. RESULTS 

In general, our goal is to replace a slow FE simulation 

with a fast ROM simulation by allowing a small relative 

error between those two simulations. In this section we 

will compare the FE solutions and the ROM solutions 

for accuracy and speed achieved by three different approaches. 

We will start with one very simple example. 

A. Simple simulation of battery. 

We consider a simulation where the battery is charged, 

discharged and charged again between 3.8V and 2.5V (all 

three times with 0.2C); see the solid blue line in Figure 3. 

To solve the full model the following spatial discretization, 

given in degrees of freedom (DOFs), for the single 

variables is used: 

φl     φsc    φsa    cl     csc    csa    total

241    101    101    241    5151   5151   10986

The simulated time is 61586 s and 1451 time steps are
used due to the adaptive time stepping algorithm. The
solutions of the full FE system are plotted in Figure 4.
These solutions were taken as snapshots for the POD
basis computation for each of the six variables. When we
cut the scaled eigenvalues at the tolerance of $10^{-8}$, the
number of POD basis functions for the single variables
is as follows: 

Fig. 3. Simple simulation of battery. 


Fig. 4. 3d plots of full FE solution for first example. 

φl     φsc    φsa    cl     csc    csa    total

5      3      3      3      14     12     42

The total dimension of the ROM is therefore 42. Also 

EIM bases for the different nonlinearities are computed. 

The ROM is then solved using these POD and EIM bases. 

For a comparison between the FE and the ROM solution
we consider the $L^2$-error. For voltage and current the
absolute error is $6.6 \cdot 10^{-6}$ and $1.0 \cdot 10^{-4}$, respectively.
The difference between the voltage of the FE solution and
the ROM solution is shown as the line marked with asterisks in Figure 3.
We also consider the relative error of each variable, which
is defined for $\phi_l$ as

$$ e_{\phi_l} := \Bigl\| \frac{\|\phi_l^{FE} - \phi_l^{ROM}\|_{L^2(\Omega)}}{\|\phi_l^{FE}\|_{L^2(\Omega)}} \Bigr\|_{L^2(0,T)}, \qquad (16) $$

and analogously for all other variables. The resulting
relative errors for each variable are: 

φl φsc φsa 

2.2 · 10 −6 

1.3 · 10 −6 

cl csc csa 

1.7 · 10 −7 

4.2 · 10 −6 

2.6 · 10 −5 

5.5 · 10 −6 

All of these relative errors are acceptably small. Now
that we have checked the accuracy of the ROM solution, we
are interested in the speed of the computation. The full
FE solution took 2828 s of CPU time while the ROM
solution took 189 s, which means that a speed-up
factor of 15 was achieved. 
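
The error measure (16) and the reported speed-up can be evaluated as in the following sketch. The spatial $L^2(\Omega)$ norm is represented by a mass matrix and the temporal $L^2(0,T)$ norm by quadrature weights; both discretizations, the placeholder data and the function name are assumptions of this sketch, and the FE solution is assumed to be nonzero at every time step.

```python
import numpy as np

def relative_error(u_fe, u_rom, M, alpha):
    """Discrete version of (16); columns of u_fe / u_rom are time steps."""
    d = u_fe - u_rom
    num = np.sqrt(np.einsum("it,ij,jt->t", d, M, d))        # ||u_FE - u_ROM||_{L2(Omega)}
    den = np.sqrt(np.einsum("it,ij,jt->t", u_fe, M, u_fe))  # ||u_FE||_{L2(Omega)}
    return float(np.sqrt(np.sum(alpha * (num / den) ** 2)))  # L2(0,T) norm of the ratio

# Placeholder data (assumptions): 5 DOFs, 4 time steps, unit time weights.
rng = np.random.default_rng(3)
u_fe = rng.standard_normal((5, 4)) + 3.0
M = np.eye(5) * 0.2
alpha = np.full(4, 1.0)

e_same = relative_error(u_fe, u_fe, M, alpha)
e_scaled = relative_error(u_fe, 0.9 * u_fe, M, alpha)

speed_up = 2828.0 / 189.0        # CPU times reported in the text
print(e_same, e_scaled, round(speed_up, 2))
```

The ratio of the two reported CPU times is slightly below 15, which is consistent with the rounded factor stated above.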

B. More general POD and EIM bases. 

We need POD and EIM bases which are more general
and applicable to a broader range of use cases concerning
the battery behavior. This is achieved by building
one basis from a set of similar simulations. 

In battery simulation a natural choice for this set of 

simulations is the discharge of the battery at different 

C-rates. To be more precise: we discharge the battery 

from 3.8V until 2.5V at the C-rates 0.1, 0.2, 0.5, 1, 

2, 3, 4 and 5C. From this set of snapshots we build 

the POD and EIM bases, again using the tolerance of 

$10^{-8}$. The ROM has dimension 85, compared to 

the full dimension of 10986 DOFs. To make sure that 


Fig. 5. Voltage of simulations at different C-rates. 

Fig. 6. Voltage of simulations not used for POD basis calculation. 

the dynamics of the different C-rates are captured well, 

we repeat all discharge simulations used for building 

the basis with the ROM. In Figure 5 we plotted the 

calculated voltage of the full FE solution and the ROM 

solution using the new POD and EIM bases. We observed 

an extremely small $L^2$ relative error and we conclude that 

the results are in good agreement for all used C-rates. 

Next we want to find out whether C-rates not used 

for the basis computation, are also simulated well by the 

ROM. For this purpose we simulate the C-rates: 0.15, 

0.3 and 3.5 with the full FE system and the ROM. The 

results are in good accordance as can be seen in Figure 6. 

We conclude that all C-rates between 0.1 and 5 can be 

simulated well with the ROM using the improved, more 

general, basis. 
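
Building one basis from several simulations amounts to concatenating the snapshot matrices and their weights before the POD computation. The sketch below uses two artificial runs that each live in a different one-dimensional subspace and, as a simplifying assumption, the Euclidean inner product (identity mass matrix) together with an SVD instead of the K-Ansatz.

```python
import numpy as np

def stack_snapshots(runs):
    """Combine (Y, alpha) pairs, one per simulation, into one snapshot set."""
    Y = np.hstack([y for y, _ in runs])
    alpha = np.concatenate([a for _, a in runs])
    return Y, alpha

# Two artificial runs (assumptions) standing in for two different C-rates.
rng = np.random.default_rng(2)
m = 30
modes = np.linalg.qr(rng.standard_normal((m, 2)))[0]
run_a = (np.outer(modes[:, 0], np.linspace(1.0, 2.0, 5)), np.full(5, 0.2))
run_b = (np.outer(modes[:, 1], np.linspace(1.0, 3.0, 8)), np.full(8, 0.1))

Y, alpha = stack_snapshots([run_a, run_b])
U, s, _ = np.linalg.svd(Y * np.sqrt(alpha), full_matrices=False)
Psi = U[:, :2]                                   # combined basis with two modes

residual = np.linalg.norm(Y - Psi @ (Psi.T @ Y)) # projection error of all snapshots
print(Y.shape, residual)
```

A basis computed from only one of the two runs would miss the dynamics of the other, which is the motivation for the combined snapshot set.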

C. POD basis switching. 

With the single POD basis we can simulate well 

different discharges in the 0.1 - 5 C-rate range and now 

we want to use the ROM for a simulation that involves 

charge and discharge processes. We have the set of POD 

and EIM bases for a discharge. Using the 

same C-rates, we likewise calculate the POD and EIM bases for 

the charge processes. Using the same tolerance, $10^{-8}$, we

Fig. 7. Voltage of simulations where bases for ROM computation 

were switched. 

obtain 110 POD basis functions. In the ROM simulation 

we can then switch between these two sets of bases. 

This is demonstrated through the simulation of 

charge/discharge processes where charge and discharge are 

simulated at C-rates not included in the bases computation. 

The voltage plot of the full FE solution and the 

ROM solution can be seen in Figure 7. 

The voltage curves are in good accordance; an error
occurs only at the switching points. Both the discharge
and the charge POD basis span a subspace of the space of
all possible solutions. When switching from the discharge
to the charge basis (or vice versa), the ROM solution is
projected from one subspace to the other and an error
occurs. 
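
The projection error at a switching point can be illustrated as follows. The sketch assumes that the switch is realized as an orthogonal projection with respect to the mass matrix inner product, which is one natural choice and not necessarily the one used in our implementation; the two artificial bases share one direction and all names are hypothetical.

```python
import numpy as np

def switch_basis(u_hat_old, Psi_old, Psi_new, M):
    """Project the ROM state from span(Psi_old) onto span(Psi_new) (M-orthogonal)."""
    u_full = Psi_old @ u_hat_old
    u_hat_new = np.linalg.solve(Psi_new.T @ M @ Psi_new, Psi_new.T @ M @ u_full)
    err = u_full - Psi_new @ u_hat_new
    return u_hat_new, float(np.sqrt(err @ M @ err))

# Artificial bases (assumptions): both contain q0, but differ in the second mode.
rng = np.random.default_rng(4)
Q = np.linalg.qr(rng.standard_normal((10, 3)))[0]
Psi_discharge = Q[:, [0, 1]]
Psi_charge = Q[:, [0, 2]]
M = np.eye(10)

_, err_shared = switch_basis(np.array([1.0, 0.0]), Psi_discharge, Psi_charge, M)
_, err_lost = switch_basis(np.array([0.0, 1.0]), Psi_discharge, Psi_charge, M)
print(err_shared, err_lost)
```

The part of the state that lies in both subspaces is transferred without error, whereas the component that the new basis cannot represent is lost at the switch.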

Below the relative errors for each variable are given, 

see (16) for the definition of the error norm. 

φl φsc φsa 

1.4 · 10 −5 

3.4 · 10 −6 

cl csc csa 

4.1 · 10 −6 

1.3 · 10 −5 

3.8 · 10 −5 

3.0 · 10 −5 

Naturally the error in this simulation is bigger than in 

the first example, nevertheless the results are acceptable. 

This method seems very suitable for the fast 

simulation of different processes, such as pulses, rest steps, 

cycling etc. To find appropriate POD and EIM bases for 

these processes, which also give a small projection error, 

remains our next task. 

V. CONCLUSION 

To speed up the numerical simulation of the lithium-ion battery model, a reduced order model was built using the proper orthogonal decomposition method and the empirical interpolation method.

Speed-up factors of 10-15 (depending on the considered case) with acceptable error were obtained. The developed method of switching between POD and EIM bases for different purposes shows good results for charging/discharging processes at different C-rates. This method is very promising for a fast simulation of different processes, e.g. pulses, rest steps, cycling, etc. Our future research will aim at the construction of appropriate POD bases for these processes. The question of whether the projection error can be minimized will also be considered.

Acknowledgment: 

The authors gratefully acknowledge financial support 

from “Zukunftsfonds des Landes Steiermark” of the Federal 

Province of Styria/Austria for the project in which 

the above presented research results were achieved. 

VI. LIST OF SYMBOLS 

$c_s$                  concentration of Li$^+$ in active material
$c_l$                  concentration of Li$^+$ in electrolyte
$\Phi_s$               electrochemical potential of active material
$\Phi_l$               electrochemical potential of electrolyte
$A_i$                  inner active surface
$\alpha_A, \alpha_K$   anodic/cathodic charge transfer coefficients
$C_{DS}$               double layer capacity
$D_l$                  solution diffusivity
$D_s$                  solid diffusivity
$\varepsilon_l$        electrolyte volume fraction
$j_{BV}$               Butler-Volmer current density
$F$                    Faraday's constant (= 96485 C mol$^{-1}$)
$k$                    exchange current density and reaction rate
$\kappa_l(c_l)$        ionic conductivity function
$\mu_l$                migration coefficient
$\sigma_s$             electronic conductivity
$R$                    universal gas constant (= 8.31447 J mol$^{-1}$ K$^{-1}$)
$T$                    temperature
$t$                    time
$t^+$                  transference number
$U_{OCV}(c_s)$         equilibrium potential function
$U_{SEI}$              ohmic loss of potential due to solid electrolyte interface at $\Omega_A$
$z$                    number of transferred electrons (for Li$^+$: $z = 1$)

REFERENCES 

[1] A.C. Antoulas, “Approximation of Large-Scale Dynamical Systems.” SIAM, 2005.

[2] M. Barrault, Y. Maday, N. Nguyen, A. Patera, “An empirical 

interpolation method: Application to efficient reduced basis 

discretization of partial differential equations.” Comptes Rendus 

Mathematique, 339: 667-672, 2004. 

[3] P. Benner, V. Mehrmann, D.C. Sorensen et al., “Dimension Reduction of Large-Scale Systems.” Springer, 2003.

[4] L. Cai and R.E. White, “An Efficient Electrochemical–Thermal 

Model for a Lithium-Ion Cell by Using the Proper Orthogonal 

Decomposition Method.” in J. Electrochem. Soc., vol. 157, pp. 

A1188-A1195, 2010 

[5] M. Cifrain et al., “Elektrochemisches Zellmodell.” publication in preparation, 2012.

[6] M. Doyle, T.F. Fuller, J. Newman, “Modeling of Galvanostatic 

Charge and Discharge of the Lithium/Polymer/Insertion Cell.” in 

J. Electrochem. Soc., vol. 140 (7), pp. 1526–1533, 1993. 

[7] O. Lass and S. Volkwein, “POD Galerkin schemes for nonlinear 

elliptic-parabolic systems.” submitted for publication in 2011 

[8] F. Pichler, “Anwendung der Finite-Elemente Methode auf ein 

Litium-Ionen Batterie Modell.” Master Thesis, University of Graz, 

2011. 

[9] S. Volkwein, “Proper orthogonal decomposition (POD) for nonlinear 

systems.” PhD program in Mathematics for Technology 

Catania, 2007. 

[10] J. Wu, J. Xu, H. Zou, “On the well-posedness of a mathematical 

model for lithium-ion battery systems.” Methods and Applications 

of Analysis, 13:275-298, 2006.


Automatic domain detection for a meshfree postprocessing 

in boundary element methods 

André Buchau, Matthias Jüttner, and Wolfgang M. Rucker 

Universität Stuttgart, Institut für Theorie der Elektrotechnik, Pfaffenwaldring 47, 70569 Stuttgart, Germany 

E-mail: andre.buchau@ite.uni-stuttgart.de 

Abstract—Modern advanced visualization techniques for three-dimensional electromagnetic fields evaluate field values in 

some points in space, which are determined only during the computation of visual objects like streamlines. Furthermore, a 

meshfree post-processing in boundary element methods along with a bidirectional coupling of numerical field computations 

with a visualization tool are advisable to significantly reduce computational costs and the total amount of stored data. However,

a completely automatic domain detection method is then required. Domain data like material values are not explicitly 

defined due to the lack of a volume mesh but are necessary for a correct computation of field values in arbitrary points. Here, 

a robust and fast octree-based method is presented to determine the domain data of an evaluation point efficiently even for 

large and complex field problems. The computational costs of position detection of a single evaluation point are kept small by 

filtering of relevant boundary elements. Furthermore, position data of other evaluation points is used if possible. 

Index Terms—boundary element methods, domain detection methods, meshfree post-processing, octree-based schemes 

I. INTRODUCTION


A very important step in numerical field computations 

is an extensive post-processing including a vivid visualization 

of the obtained results. Today, visualization tools like virtual and augmented reality along with modern visualization techniques give even experienced engineers a deep insight into the physical properties of the studied problem [1, 2]. However, an expressive visualization

of three-dimensional fields is still a challenge. One 

possibility is to use volume rendering in the case of scalar 

data [3]. An interesting approach, which has been presented 

for two-dimensional magnetic fields, is to compute 

visual objects that represent the topology of the vector 

field [4]. Further techniques, which are more commonly 

used, are to filter three-dimensional data first and to visualize 

three-dimensional fields in slices or to compute 

streamlines or isosurfaces. Data exchange between the 

numerical field computation tool and the visualization 

tool is normally done with the help of volume meshes of 

all considered domains. The field values are precomputed 

in all nodes of this mesh and transferred to the 

visualization tool that performs the post-processing independently 

of the numerical field computation tool. 

The boundary element method (BEM) is a very attractive 

method for the solution of three-dimensional electromagnetic 

field problems, which consist of multiple, 

piece-wise homogeneous, linear media. Then, a modeling 

and discretization of domain surfaces suffices. Hence, the 

discretized model is much smaller than in volume-based 

methods like the finite element method (FEM). Furthermore, 

well-established compression techniques for the 

linear system of equations exist to enable a fast and efficient 

solution of large and complex field problems [5]. 

However, an additional volume mesh is normally created 

for the post-processing [6]. Domain data like material 

values are assigned to the auxiliary mesh and are available 

for the computation of field values in the mesh nodes. 

The application of the fast multipole method (FMM) 

enables the computation of field values in a huge number 

of points at acceptable computational costs [7], but the 

amount of data, which must be stored and transferred to 

the visualization tool, is relatively large and a bottleneck. 

An approach that fits the basic concept of a BEM much better is meshfree post-processing [8]. There, field values are computed only at points that are necessary for visualization. Hence, the number of evaluation

points is dramatically reduced in comparison to the 

classical approach, which uses an auxiliary volume mesh. 

Furthermore, the meshfree approach is much more flexible. 

Post-processing domains and visualization techniques 

are defined completely after the solution of the 

problem. The creation of an expensive volume mesh is 

unnecessary and the BEM is integrated into the visualization 

tool. However, automatic domain detection is required to make meshfree post-processing applicable to problems that consist of multiple domains.

Here, a novel method is presented that completely automatically 

detects the position of an arbitrary evaluation 

point directly from the given boundary element mesh. 

The octree-based method is fast and efficient to enable 

extensive post-processing of large and complex field 

problems. A very small number of boundary elements are 

extracted from the total model to perform the position 

detection efficiently with a method that is similar to the 

well-known ray tracing method [9]. Furthermore, domain 

data of other evaluation points is used if possible. 

The paper is structured as follows. First, the problem 

of position detection directly from boundary elements is 

formulated and the concept of the presented method is 

shown. Then, a flexible octree-based scheme is introduced 

to enable a grouping of all boundary elements 

regarding their position in three-dimensional space. It is 

applied to determine the position of an evaluation point 

with the help of other evaluation points of the same domain 

or to filter candidate boundary elements for the 

actual position detection, which is described afterwards. 

There, a method similar to ray tracing is presented to find 

a boundary element that is the closest boundary element 

to the given evaluation point with the help of a small list 

of candidates. Two numerical examples have been studied 

to demonstrate robustness and efficiency of the presented 

automatic domain detection method. Finally, significance 

of the method to BEM and an outlook to future 

work are given in the conclusions.

II. NUMERICAL METHOD 

In general, automatic domain detection is required for each evaluation point in meshfree BEM post-processing. Since both the number of evaluation points and the number of boundary elements are often very large in practical three-dimensional problems, a fast algorithm that exploits properties of BEM is advisable.

The concept of the presented approach is described in the first sub-section. There, a technical formulation of the problem is given along with a comparison to ray tracing methods. The second sub-section is about the novel fast octree-based scheme. It is the key to a successful application of meshfree post-processing in large BEM problems. Finally, an efficient and general method of ray tests based on gradient search is given in the third sub-section. Note that the presented approach has been developed to achieve two goals: an efficient and robust automatic domain detection method and a flexible octree for fast and efficient post-processing computations using the fast multipole method (FMM).

A. Concept of automatic position detection 

The aim of the presented automatic domain detection method is to determine the domain $D_e$ of an arbitrary evaluation point, which is defined by its global Cartesian coordinates, e.g. during the computation of a streamline,

$$\mathbf{r}_e = (x_e, y_e, z_e)^T. \qquad (1)$$

The complete BEM problem is discretized by in total $N$ boundary elements. The shape of the boundary elements and the order of their shape functions can be chosen arbitrarily for the presented domain detection method. The direction of the normal vector $\mathbf{n}$ of a boundary element is known. Furthermore, the domain $D^+$, which lies in direction of $\mathbf{n}$, and the domain $D^-$, which lies in direction of $-\mathbf{n}$, are stored for each boundary element.

The concept of the presented approach is to construct a single ray, which starts at the given evaluation point,

$$\mathbf{r}(t) = \mathbf{r}_e + t\,\mathbf{d}, \qquad (2)$$

where $\mathbf{d}$ is the direction of the ray and $t \geq 0$ is a parameter. Then, the domain $D_e$ is determined with the help of the first boundary element that is intersected by the ray (2). An example configuration is given in Fig. 1. The red point is the given evaluation point, the blue line is the ray, and the green and yellow lines are two boundary elements. Here, the green boundary element is hit first, so the domain of the evaluation point is the domain stored for the side of the green boundary element that faces the evaluation point.

Fig. 1: Example of domain detection with the help of a ray 

The concept of domain detection is similar to the well-known ray tracing method [9]. The main difference is that

the direction of the ray is unknown. In ray tracing, the 

direction corresponds to the view direction. Furthermore, 

a single ray suffices for successful domain detection. In 

ray tracing of computer graphics, a ray for each pixel of 

the screen is constructed. Hence, an adapted octree-based 

algorithm for BEM is presented in the next sub-section. 
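The ray concept of this sub-section can be sketched for linear triangles as follows. The sketch makes several assumptions that are not taken from the paper: the ray direction is simply prescribed (the paper determines it differently), the standard Möller-Trumbore test is used for the intersection, the element normal follows from the vertex order, and all function and variable names are hypothetical.

```python
import numpy as np

def ray_triangle(origin, direction, tri, eps=1e-12):
    """Moeller-Trumbore test; returns the ray parameter t >= 0 or None."""
    v0, v1, v2 = tri
    e1, e2 = v1 - v0, v2 - v0
    p = np.cross(direction, e2)
    det = np.dot(e1, p)
    if abs(det) < eps:              # ray parallel to the triangle plane
        return None
    s = origin - v0
    u = np.dot(s, p) / det
    q = np.cross(s, e1)
    v = np.dot(direction, q) / det
    if u < 0 or v < 0 or u + v > 1:  # hit point outside the triangle
        return None
    t = np.dot(e2, q) / det
    return t if t >= 0 else None

def detect_domain(point, direction, elements):
    """elements: list of (triangle, domain_plus, domain_minus)."""
    best = None
    for tri, d_plus, d_minus in elements:
        t = ray_triangle(point, direction, tri)
        if t is not None and (best is None or t < best[0]):
            normal = np.cross(tri[1] - tri[0], tri[2] - tri[0])
            # A ray travelling along +n starts on the -n side of the element.
            domain = d_minus if np.dot(normal, direction) > 0 else d_plus
            best = (t, domain)
    return None if best is None else best[1]

# Two parallel triangles at z = 1 and z = 2 with normals in +z direction
tri_low = np.array([[0., 0., 1.], [1., 0., 1.], [0., 1., 1.]])
tri_high = tri_low + np.array([0., 0., 1.])
elements = [(tri_low, "air", "dielectric"), (tri_high, "outer", "air")]
up = np.array([0., 0., 1.])
below = detect_domain(np.array([0.2, 0.2, 0.0]), up, elements)
between = detect_domain(np.array([0.2, 0.2, 1.5]), up, elements)
above = detect_domain(np.array([0.2, 0.2, 3.0]), up, elements)
```

Only the first hit matters, so the loop keeps the smallest ray parameter; a ray that hits nothing returns `None`, which in practice would call for another direction.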


B. Fast octree-based algorithm 

A fast algorithm is necessary to enable an application 

of the automatic domain detection method to large BEM 

problems. The costs of a standard implementation of the concept, which has been presented in the previous sub-section, are proportional to the total number of boundary elements. Hence, an efficient filtering of relevant boundary elements is required to reduce computation time. Furthermore, the computational costs are proportional to the number of evaluation points, which are given by the visualization tool. However, an

evaluation point is often close to a previous evaluation 

point, for instance points on a streamline. An approach to 

reduce the computational costs is to use domain data of 

previous evaluation points if possible. 

Trees are a common method to group and select 

boundary elements in three-dimensional space. Here, an 

octree has been chosen, since the FMM is used to accelerate 

BEM computations and the FMM is based on an 

octree, too [10]. An implementation of the octree, which 

uses modern software techniques, enables the application 

of the same code foundations for both domain detection 

and post-processing computations. Then, the code is more reliable and the complete method works more robustly.

The first step is to initialize the octree using all boundary elements. The so-called root cube at octree level 0 is the smallest cube that encloses all boundary elements. Its edges are parallel to the axes of the global Cartesian coordinate system. Then, the root cube is subdivided into eight equally sized cubes and the boundary elements are assigned to these so-called child cubes. Each

child cube is again subdivided into child cubes. The subdivision 

is continued while the boundary elements of a 

cube can be assigned to its child cubes or the total number 

of octree levels is smaller than a given limit. 
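The initialization can be sketched as follows. This is a simplified illustration and not the adaptive octree of the paper: elements are represented by their centroids only, the root cube encloses the centroids rather than the full elements, and every element is pushed down to a fixed maximum level. The class `Cube` and the function `build_octree` are hypothetical names.

```python
import numpy as np

class Cube:
    def __init__(self, center, half, level):
        self.center, self.half, self.level = center, half, level
        self.elements = []      # indices of elements whose centroid lies here
        self.children = {}      # octant index (0..7) -> Cube

    def octant(self, point):
        """Octant index from three coordinate comparisons with the center."""
        return sum(1 << k for k in range(3) if point[k] >= self.center[k])

    def insert(self, index, centroid, max_level):
        if self.level == max_level:
            self.elements.append(index)
            return
        o = self.octant(centroid)
        if o not in self.children:   # child cubes are created on demand
            offset = np.array([1 if (o >> k) & 1 else -1 for k in range(3)])
            self.children[o] = Cube(self.center + 0.5 * self.half * offset,
                                    0.5 * self.half, self.level + 1)
        self.children[o].insert(index, centroid, max_level)

    def find_leaf(self, point):
        """Descend to the finest existing cube that contains the point."""
        child = self.children.get(self.octant(point))
        return self if child is None else child.find_leaf(point)

def build_octree(centroids, max_level):
    lo, hi = centroids.min(axis=0), centroids.max(axis=0)
    root = Cube(0.5 * (lo + hi), 0.5 * float(np.max(hi - lo)), 0)
    for i, c in enumerate(centroids):
        root.insert(i, c, max_level)
    return root

centroids = np.array([[0.0, 0.0, 0.0], [1.0, 1.0, 1.0], [0.9, 0.9, 0.9]])
root = build_octree(centroids, max_level=2)
leaf = root.find_leaf(np.array([1.0, 1.0, 1.0]))
empty = root.find_leaf(np.array([0.9, 0.1, 0.1]))
```

A point in a region without elements ends up in a coarse cube because no finer child exists there, which mirrors the idea of keeping cubes of evaluation points as large as possible.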

Boundary elements are assigned to a cube, if their centroid 

lies inside the cube. Of course, the position of 

boundary elements and their real dimensions must be 

considered. The bounding box of a cube is determined 

including all its boundary elements. This bounding box 

must completely lie inside a test cube with the same center 

as the considered cube and an edge length of 

The domain detection starts with the addition of the evaluation point $\mathbf{r}_e$ to the octree. If $\mathbf{r}_e$ is outside the root cube of the octree, the spatial domain of the octree is enlarged by creating a new root cube. Of course, the number of octree levels is increased in that case. The evaluation point is assigned to

a cube in the same way as the boundary elements during 

initialization. Here, the subdivision of a cube with an 

evaluation point is aborted, if no boundary elements are 

assigned to the cube of the evaluation point. The number 

of evaluation points in a cube is not taken into account. 

Hence, the cubes of evaluation points are chosen as large 

as possible. 

An example is given in Fig. 2. The black lines with 

black points represent some boundary elements. The red 

point is an evaluation point. The thick black lines are the 

octree cubes, which have been created according to the 

above-described rules. One boundary element is assigned 

to one cube and the boundary elements stick slightly out 

of the cubes. In contrast, the cube of the evaluation point 

is relatively large. 

Fig. 2: Example of an octree for position detection 

A strategy is to reduce the number of actual domain 

detections. Hence, a goal is to determine the domain not 

only for the given evaluation point but also for the cube 

of the evaluation point if possible. Consequently, an initial 

cube is searched that fulfills the following conditions. 

The octree level of a cube of an evaluation point is at most the finest octree level of cubes of boundary elements. While the octree level criterion is satisfied, the

cube of the evaluation point is refined until no boundary 

elements are assigned to that cube or no boundary elements 

stick into this cube. To avoid expensive tests based 

on the real position of boundary elements, first a cube is 

searched that has no neighbors with boundary elements. 

If the cube of the evaluation point already has no neighbors 

with boundary elements, its parent cube is tested for 

the above-described rules. 

The black cube of the evaluation point in Fig. 2 has 

neighbor cubes with boundary elements. Hence, the black 

cube is refined and the blue cube is obtained. Since the 

blue cube has also neighbor cubes with boundary elements, 

it is refined as well and the green cube is the initial 

cube for domain detection. 

If no elements are lying inside the cube $C_e$ of the evaluation point, the domain is determined for $C_e$ including all its child cubes. The domain data of the cube is then used for all evaluation points that are assigned to the cube at a later moment. Neighbor cubes of the cube $C_e$ are at several octree levels due to the adaptive octree rules of octree initialization.

Furthermore, the octree structure changes during the post-processing. Hence, it is not possible to determine cube

neighbors in advance. Neighbor cubes, or cubes in general, 

are searched by the position of the cube center. Here, 

the centers of possible neighbor cubes are computed and 

the cubes are searched starting from the root cube by 

simple and fast comparison of Cartesian coordinates. 
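A minimal sketch of the first half of this search, the computation of candidate neighbor centers, is given below. It assumes neighbors at the same octree level as the cube itself, so that the 26 candidate centers are obtained by shifting the cube center by one edge length per axis; the function name is hypothetical. Each center would then be located in the tree by descending from the root cube.

```python
import itertools
import numpy as np

def neighbor_centers(center, edge):
    """Centers of the 26 same-level cubes that share a face, edge or corner."""
    center = np.asarray(center, dtype=float)
    offsets = [o for o in itertools.product((-1, 0, 1), repeat=3) if any(o)]
    return [center + edge * np.array(o, dtype=float) for o in offsets]

centers = neighbor_centers([0.5, 0.5, 0.5], edge=1.0)
```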

If the domain data of a neighbor cube of the cube $C_e$ of the evaluation point has already been determined, it can be used for the cube $C_e$, too. Otherwise, a cube $C_b$ with boundary elements among the second neighbors of cube $C_e$ has to be chosen for domain detection. Since no elements are assigned to the cube $C_e$ and its neighbor cubes, one second neighbor cube, which adjoins a neighbor cube of $C_e$, suffices for domain detection. At least one second direct neighbor cube with boundary elements exists, because the cube $C_e$ is chosen as large as possible as described above.

After a cube $C_b$ has been chosen, the neighbor cube $C_n$ of $C_e$, which lies between $C_e$ and $C_b$, is determined. First, all boundary elements that stick into $C_n$ are searched. Note that most neighbors of $C_n$ are without boundary elements and only neighbors that adjoin $C_b$ have to be considered. In practice, the number of relevant cubes is small. Furthermore, the boundary elements that are assigned to $C_b$ and that stick into $C_n$ are determined. In total, a list with $N_r$ relevant boundary elements is obtained. To improve robustness of

the domain detection method, a new point for domain 

detection is defined inside and close to . 

Some examples of typical situations of domain detection 

are depicted in Fig. 3. 

Fig. 3: Some typical situations of domain detection 

First, the domain of the red evaluation point is determined. 

Since no boundary elements are assigned to the 

red cube, the domain of the red cube is evaluated. The 

black cube with the red lines is chosen as cube . The 

orange evaluation point inside is used for the domain 

detection of the red evaluation point. Next, the

domain of the blue evaluation point is detected. Since the 

blue evaluation point lies within a cube with boundary 

elements, these boundary elements are used for domain 

detection. Finally, the domain of the green evaluation 

point is determined with the help of domain data of its red 

neighbor cube. 

As already mentioned, the list of relevant 

boundary elements for domain detection includes all 

boundary elements, which are assigned to a relevant cube 

and all boundary elements of neighbor cubes, which stick 

into the relevant cube. A fast method to test whether a boundary element sticks into a cube is to test for an intersection of the bounding box of the cube and the bounding box of the boundary element. Two examples of boundary

elements inside a cube are given in Fig. 4. Although no 

boundary elements are assigned to the blue cube, the blue 

boundary element of its neighbor cube must be taken into 

account, since it sticks into the blue cube. In the case of 

the red cube, one red boundary element, which is assigned 

to the red cube, and one red boundary element of 

its neighbor cube have to be considered. 
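The bounding box test can be sketched with axis-aligned boxes as follows. The sketch assumes that a box is stored as a pair of minimum and maximum corner coordinates and that touching boxes count as intersecting; the function names are hypothetical and do not refer to a particular library.

```python
import numpy as np

def aabb_of(points):
    """Axis-aligned bounding box (min corner, max corner) of a point set."""
    pts = np.asarray(points, dtype=float)
    return pts.min(axis=0), pts.max(axis=0)

def boxes_intersect(box_a, box_b):
    """True if the two boxes overlap (or touch) along all three axes."""
    (lo_a, hi_a), (lo_b, hi_b) = box_a, box_b
    return bool(np.all(lo_a <= hi_b) and np.all(lo_b <= hi_a))

cube_box = aabb_of([[0, 0, 0], [1, 1, 1]])
sticking = aabb_of([[0.9, 0.5, 0.5], [1.3, 0.5, 0.5], [1.1, 0.8, 0.5]])
far_away = aabb_of([[2, 2, 2], [3, 2, 2], [2, 3, 2]])
hit = boxes_intersect(cube_box, sticking)
miss = boxes_intersect(cube_box, far_away)
```

The test is conservative: overlapping boxes do not guarantee that the element itself enters the cube, but an element whose box misses the cube can safely be discarded.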

Fig. 4: Examples of elements inside a cube 

In total, the presented octree-based method is fast, since the number of relevant boundary elements is approximately independent of the total number of boundary elements, at least in the case of large problems. If possible, domain data for cubes is determined. The domain detection with the help of the filtered boundary elements is described in the following sub-section.

C. Domain detection from boundary elements 

The starting point of a domain detection from boundary elements is a very small list of relevant boundary elements, which is determined using the octree-based method of the previous sub-section, and the given evaluation point (1).

Furthermore, domain detection from boundary elements 

is only necessary, if domain data of octree cubes cannot 

be used. Hence, general applicability and extension possibilities 

of the following method are more important than 

pure efficiency considerations. 

The initial step is to sort the boundary elements 

by their distance to the evaluation point 

= , 0

III. NUMERICAL EXAMPLES 

The presented automatic domain detection method for 

BEM has been tested on two numerical examples. The 

first example is a capacitor, which can be simply discretized 

with different sizes of boundary elements. Hence, it 

is well suited for fundamental tests of the automatic domain 

detection method. The second example is an inductor 

of a micro-electro-mechanical system. It represents a typical configuration in an application of BEM with frequently changing domains in a slice. There, domain data of evaluation

points cannot be easily defined and the power of 

the presented automatic domain detection method is 

clearly demonstrated. 

The domain detection method has been implemented in C# using the .NET Framework 4.0 [11]. C# is a managed language and it supports the necessary data handling of the domain detection method very well. Furthermore, the interface of C# to native C++ enables the use of existing high-performance code of numerical methods, for instance the used BEM and FMM implementation. Bounding boxes including intersection tests or vector operations are standard methods of the Windows Presentation Foundation (WPF), which is part of the .NET Framework. Furthermore, the Windows Communication Foundation (WCF) of the .NET Framework enables an interactive data exchange between different processes based on Extensible Markup Language (XML) over Hypertext Transfer Protocol (HTTP). Here, WCF is applied to couple

the process of the visualization tool HLRS COVISE, 

which is developed at the High Performance Computing 

Center at the University of Stuttgart, with the implementation 

of the domain detection method. HLRS COVISE 

visualizes three-dimensional data with the help of virtual 

and augmented reality techniques [1]. 

The surfaces of both numerical examples have been 

discretized with second order, quadrilateral boundary 

elements. The Galerkin method has been applied to indirect 

and direct BEM formulations. The matrix of the 

linear system of equations has been compressed with the 

help of the fast multipole method [12]. The matrix has 

been assembled in parallel using the OpenMP standard. 

The system of linear equations has been solved in parallel, 

too. An implementation of the BEM in combination 

with the FMM in C++ has been executed on a workstation 

with two six-core Intel Xeon E5649 2.53 GHz 

processors. 

Although the presented domain detection method supports 

all kinds of boundary elements, the second order, 

quadrilateral boundary elements have been converted into linear, triangular boundary elements for the post-processing. The reason is that in computer graphics only

linear elements, often only linear triangles, are well supported. 

Rendering and graphics processing on linear triangles 

is much faster than on other types of elements and 

linear triangles are supported by modern graphics processors. 

The implementation of the domain detection method 

has been executed on an Intel Core 2 Duo T9900 

3.06 GHz laptop processor using a single core. Graphical 

objects have been rendered on an NVIDIA Quadro FX 770M graphics card.
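The conversion of quadrilaterals into linear triangles can be sketched as follows. The node ordering is an assumption for illustration: the first four entries of each element are taken to be its corner nodes in cyclic order, and any mid-side nodes of the second order elements are simply dropped. The function name is hypothetical.

```python
def quads_to_triangles(quads):
    """Split each quadrilateral into two triangles along the 0-2 diagonal.

    Assumption: the first four entries of each element are its corner
    nodes in cyclic order; mid-side nodes (if any) are dropped.
    """
    triangles = []
    for q in quads:
        a, b, c, d = q[:4]
        triangles.append((a, b, c))
        triangles.append((a, c, d))
    return triangles

# One 8-node quadrilateral: corners 0..3, mid-side nodes 4..7
single = quads_to_triangles([(0, 1, 2, 3, 4, 5, 6, 7)])
many = quads_to_triangles([tuple(range(8))] * 9600)
```

Each quadrilateral yields two triangles, which is consistent with the element counts reported for the two examples below.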


A. Capacitor 

The electric field of a capacitor has been studied as the first example. The capacitor consists of two square electrodes and a homogeneous, linear, isotropic dielectric between the electrodes. The potential of the electrodes has been set to 0.5 V and -0.5 V, respectively. The relative permittivity of the dielectric is 10.

The capacitor has been discretized with 9600 second 

order, quadrilateral elements (Fig. 5). An indirect BEM 

formulation is applied. The Dirichlet boundary conditions 

are the potential at the two electrodes. The Neumann 

boundary condition is the continuity of the electric flux 

density at the surface between the dielectric and the surrounding 

free space domain. The corresponding linear 

system of equations with in total 29442 unknowns has 

been solved iteratively using the generalized minimal residual method (GMRES) along with a Jacobi preconditioner within 91 iteration steps in approximately 3 minutes.

Fig. 5: Discretized BEM model of a capacitor 

The original second order boundary elements have 

been converted into 19200 first order, triangular elements 

for the post-processing including the domain detection. 

The boundary elements are grouped by an octree, which consists of 9 octree levels with 2 elements assigned to a cube on average. The maximum number of elements of a cube is 6. The domain data at 40000 evaluation points in a slice has been determined in 23 s (Fig. 6). The red color

represents the domain inside the dielectric and the green 

color represents the air domain. Furthermore, the two 

electrodes are depicted. The color at the electrodes displays 

the surface charge density, which equals the solution 

of the linear system of equations. The surface of the 

dielectric has been omitted for graphical reasons. 

The presented octree-based scheme reduces the number 

of boundary elements, which must be considered for 

correct domain detection, from 19200 to at most 39. The

domain data of 93 % of the given evaluation points could 

be obtained from position data of the octree cubes without 

expensive ray hit tests.
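As a rough check, the following figures can be derived from the numbers reported in this sub-section. The time per point is an average over all evaluation points and not a separately measured quantity.

```latex
\begin{align*}
\frac{23\,\mathrm{s}}{40000} &\approx 0.58\,\mathrm{ms} \quad \text{(average time per evaluation point)},\\
0.93 \cdot 40000 &= 37200 \quad \text{(points resolved from cube data)},\\
40000 - 37200 &= 2800 \quad \text{(points requiring ray hit tests)},\\
\frac{19200}{39} &\approx 492 \quad \text{(minimum reduction factor of candidate elements)}.
\end{align*}
```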

Fig. 6: Detected domains in a slice through the capacitor 

B. Inductor in micro-electro-mechanical systems 

The electric current inside an inductor of a micro-electro-mechanical system (MEMS) has been studied as the second example. The inductor has been discretized using

9168 second order, quadrilateral elements (Fig. 7). The 

potential at the ports of the inductor has been set as Dirichlet 

boundary condition of a direct BEM formulation. 

The linear system of equations with in total 27594 unknowns 

has been solved within 166 iteration steps of 

GMRES in approximately 4 minutes. 

Fig. 7: Discretized BEM model of an inductor in MEMS

The boundary elements have been converted into 18336 first order, triangular elements for the post-processing. The domain at 40000 evaluation points, which lie in a slice through the inductor (Fig. 8), has been detected in 106 s. The yellow color represents the domain inside the conductor of the inductor and the blue color represents the surrounding free space domain. Although the slice is often intersected by boundaries of the inductor, the number of actual domain detections is reduced by 45 % by using domain data of the octree cubes. The number of relevant boundary elements is at most 26.
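The corresponding figures for the inductor follow from the reported numbers in the same way. The count of remaining detections assumes that, without cube data, one actual domain detection per evaluation point would be required.

```latex
\begin{align*}
\frac{106\,\mathrm{s}}{40000} &= 2.65\,\mathrm{ms} \quad \text{(average time per evaluation point)},\\
(1 - 0.45) \cdot 40000 &= 22000 \quad \text{(remaining actual domain detections)},\\
\frac{18336}{26} &\approx 705 \quad \text{(minimum reduction factor of candidate elements)}.
\end{align*}
```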

Fig. 8: Detected domains in a slice through the inductor 

IV. CONCLUSION



A fast and efficient automatic domain detection method 

for a meshfree post-processing in three-dimensional 

boundary element methods has been presented. Relevant 

boundary elements are filtered with the help of an adaptive 

octree-based method. Hence, the number of boundary elements that have to be considered for domain detection is drastically reduced and approximately independent of the total number of boundary elements for large problems.

The application of the octree, bounding boxes, and 

optimized standard libraries results in very low computational 

costs. As a result, the shown domain detection 

method enables an efficient post-processing in a large 

number of evaluation points even for large and complex 

BEM problems. Furthermore, the complete method including 

its implementation is very flexible and supports 

all types of elements. Hence, it is not restricted to pure BEM applications; volume elements of a volume integral equation can be used, too. Finally, the numerical examples show that domain data is detected at arbitrarily chosen evaluation points reliably and efficiently.

The shown method is a very important step towards 

flexible and powerful post-processing in boundary element 

methods. A direct coupling of visualization tools with a boundary element method is enabled. Post-processing objects such as streamlines or isosurfaces can be computed totally meshfree even in the case of multiple

domains. The octree, which is used here, is the basis of 

fast and efficient field computations using the fast multipole 

method, too. 

REFERENCES

[1] U. Lang and U. Wössner, “Virtual and augmented reality developments for engineering applications”, Proceedings of ECCOMAS 2004, Jyväskylä, July 24-28, pp. 24-8., 2004

[2] A. Buchau, W. M. Rucker, U. Wössner, and M. Becker, “Augmented reality in teaching of electrodynamics”, COMPEL, vol. 28, no. 4, pp. 948-963, 2009

[3] D. Weiskopf, „GPU-Based Interactive Visualization Techniques“, 

Springer, 2006 

[4] S. Bachthaler, F. Sadlo, R. Weeber, S. Kantorovich, Ch. Holm, 

and D. Weiskopf, “Magnetic Flux Topology of 2D Point Dipoles”, 

Eurographics Conference on Visualization (EuroVis) 2012, vol. 

31, no. 3, 2012 

[5] A. Buchau, W. M. Rucker, O. Rain, V. Rischmüller, S. Kurz, S. 

Rjasanow, “Comparison Between Different Approaches for Fast 

and Efficient 3D BEM Computations”, IEEE Transactions on 

Magnetics, vol. 39, no. 3, pp. 1107-1110, 2003 

[6] W. Hafla, A. Weinläder, A. Bardakcioglu, A. Buchau, and W. M. 

Rucker, “Efficient Post-Processing with the Integral Equation 

Method”, COMPEL, vol. 26, no. 3, pp. 873-887, 2007 

[7] A. Buchau, W. Rieger, and W. M. Rucker, “Fast Field Computations 

with the Fast Multipole Method”, COMPEL, vol. 20, no. 2, 

pp. 547-561, 2001 

[8] A. Buchau and W. M. Rucker, “Meshfree Visualization of Field 

Lines in 3D”, 14th IGTE Symposium, pp. 172-177, Graz, 2010

[9] J. Goldsmith, J. Salmon, “Automatic Creation of Object Hierarchies 

for Ray Tracing”, IEEE Computer Graphics and Applications, 

vol. 7, no. 5, pp. 14-20, 1987 

[10] A. Buchau, Ch. J. Huber, W. Rieger, W. M. Rucker, ”Fast BEM 

Computations with the Adaptive Multilevel Fast Multipole Method”, 

IEEE Transactions on Magnetics, vol. 36, no. 4, pp. 680-684, 

2000 

[11] “.NET Framework Developer Center”, Microsoft Corporation 

[12] A. Buchau, W. Hafla, F. Groh, and W. M. Rucker, ”Parallelized 

Computation of Compressed BEM Matrices on Multiprocessor 

Computer Clusters”, COMPEL, vol. 24, no. 2, pp. 468-479, 2005


Efficient modeling of coil filament losses in 2D 

L. Lehti∗, J. Keränen†∗, S. Suuriniemi∗, T. Tarhasaari∗, and L. Kettunen∗

∗ Tampere University of Technology - Electromagnetics, P.O. Box 692, FI-33101 Tampere, Finland

† VTT Technical Research Centre of Finland, P.O. Box 1300, FI-33101 Tampere, Finland

E-mail: leena.lehti@tut.fi

Abstract—Practical estimates for losses in the coil filaments of a FEM model are sought. A low-dimensional function space is introduced on the filament-air interface and then suitably extended into the filament to significantly reduce the number of unknowns per filament. A careful choice of extensions enables good loss estimate accuracy. The result is a system matrix assembly block that can be used verbatim for all filaments, further reducing the cost. Both the net current and the voltage per length of the filament are readily available in the problem formulation.

Index Terms—coil modeling, FEM, winding loss estimate 

I. INTRODUCTION


Improving the efficiency of electrical machines is an 

important aspect of machine design. Modeling methods 

are required to be as fast and accurate as possible to 

help the designer optimize the machines. One important ingredient of machine design is the estimate of Ohmic coil losses, which enables the designer to place the conductors such that the losses are minimized.

Solving for the conductor losses is not a straightforward task. The losses depend on the conductivity and the current density. The current density is affected by numerous factors, which cannot be solved for separately. These factors include the feeding current, the fields generated by neighboring conductors, the placement of the permeable materials, and, depending on the frequency, the skin effect.

Different methods have been developed to overcome 

these difficulties. One method is to replace the conductivity 

of the material by other parameters which 

transform the eddy-current losses to hysteresis losses 

[1]. In this approach, parameters for resistance and inductance are sought. Separate meshes for the magnetic and electric problems are introduced in [2]. The results are acceptable, but parts of the method are computationally slow. In [3] a cell model is used to reduce the

computational effort and to extract the resistance of 

windings. A homogenization technique in [4] derives 

parameters to characterize skin and proximity effects in 

windings. Surface impedance methods have also been 

used [5], where it is assumed that the magnetic flux does 

not penetrate into the conducting material. Therefore the 

conducting material can be approximated on the surface 

only and the interior of the material can be ignored. This 

can be used for conductors with low curvature and small 

skin depth. 

We aim at a good trade-off between moderate calculation time and accuracy of the loss estimates while maintaining an explicit connection to the exterior circuit. In addition, the conductors can be placed freely, i.e. the periodic spacing required in [3] and [4] is not needed.

Previously [6], magnetic flux was not allowed to enter the 

filaments and a separable problem was achieved, i.e. the 

field problem was divided into the filament interiors and 

the exterior. The exterior and interior had only limited 

interaction through net current and constant stream function 

values on the boundary. To improve the accuracy, 

it is necessary to admit some magnetic flux into the 

filaments while solving for the exterior problem. Ignoring 

the magnetic energy and losses inside the filaments on 

the exterior problem enables subsequent solving for the 

interior problem, but it leads to a very small reluctance 

inside the filaments. When the filaments are close to each 

other, this is detrimental. 

Consequently, the filament interiors have to be coupled 

with the exterior problem. To save computational effort, 

the function space on the filament interface is significantly 

limited. The basis representing the result inside 

the filaments is spanned by solutions of eddy current 

problems that use the functions from the interface as 

boundary conditions. The use of these solutions improves 

the loss estimates significantly compared to [6] and the 

same solutions can be used for all filaments with similar 

cross-section. The resulting method is called an interface 

technique. 

II. EXTERIOR FORMULATION 

Here, we concentrate on two-dimensional cases, since 

they are widely used in industry. Solving for 2D problems 

is simple, and they provide enough accuracy for many industrial 

applications. Since we look for a time-harmonic 

solution, the materials used are assumed to be linear. 

However, the method could be extended to nonlinear 

time-domain problems with convolution techniques if the 

material of the filaments stays linear. 

A few terms are illustrated in Figure 1 for convenience. The exterior problem refers to $\Omega_e$ and the interior problem to $\Omega_{in} = \bigcup_j \Omega_j$. The geometry contains $K$ conducting filaments $\Omega_j$, and the relative permeability of the core is 1000. The whole domain is $\Omega = \Omega_{in} \cup \Omega_e$, and on the boundary of $\Omega$ the stream function is set to zero. This geometry, with $K = 25$ and $f = 50$ Hz, is also used as an example in the computations. The radius $r$ of each filament is 0.01 m, which produces a challenging modeling problem (skin depth/radius ≈ 1).
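
The ratio of skin depth to radius quoted above can be checked with a few lines of Python. This is a minimal sketch: the conductivity `SIGMA_CU` is an assumed value for copper (the text does not state the filament conductivity), and the formula is the one given later in the caption of Table I, $\delta = \sqrt{2/(\omega\sigma\mu)}$.

```python
import math

MU0 = 4e-7 * math.pi  # vacuum permeability [H/m]

def skin_depth(freq_hz, sigma, mu=MU0):
    """Skin depth delta = sqrt(2 / (omega * sigma * mu)) in metres."""
    omega = 2.0 * math.pi * freq_hz
    return math.sqrt(2.0 / (omega * sigma * mu))

SIGMA_CU = 5.96e7   # ASSUMPTION: copper conductivity [S/m], not given in the text
radius = 0.01       # filament radius from the text [m]

delta = skin_depth(50.0, SIGMA_CU)
ratio = delta / radius
print(f"delta = {delta * 1e3:.2f} mm, delta/r = {ratio:.2f}")
```

With the assumed copper conductivity the ratio comes out close to the value 0.92 reported with Table I, which suggests (but does not prove) that the filaments of the example are copper.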

Fig. 1. An example geometry for a transformer with an E-shaped core. There are 25 conductors in the coil and their net current is set to one. The core material’s relative permeability is 1000 and $f = 50$ Hz. $\Omega$ is the whole domain that consists of $\Omega_{in} = \bigcup_j \Omega_j$ and $\Omega_e$. The figure labels the symmetry axis, the filament boundaries $\partial\Omega_j$ and the outer boundary condition $b \cdot n = 0$.

In our previous work [6], the floating potential approach 

with a constant but unknown stream function on 

each filament boundary was used. This prevented the flux 

from entering the filament and the eddy currents were 

similar in all filaments. Now, we replace the constant 

potential with a low-dimensional subspace of functions, 

L, that enables the magnetic flux to enter the filament. 

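The function space comprises a constant function and trigonometric functions. Let $\hat{L}$ be the space of Whitney nodal basis function interpolations of $L$. We construct the following formulation for the exterior problem

$$\operatorname{div} \frac{1}{\mu} \operatorname{grad} a = 0 \quad \text{in } \Omega_e, \tag{1}$$

$$a = 0 \quad \text{on } \partial\Omega, \tag{2}$$

$$a = \sum_{i=0}^{M-1} c_{ij} f_i, \quad f_i \in \hat{L} \quad \text{on } \partial\Omega_j \ \forall j, \tag{3}$$

$$\oint_{\partial\Omega_j} h \cdot dl = I_j \quad \text{on } \partial\Omega_j \ \forall j. \tag{4}$$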

Here $\mu$ is the permeability, $a$ the stream function, $M$ the number of functions on the interface, $c_{ij}$ are complex scalar coefficients, $h$ the magnetic field, and $I_j$ the net current in the $j$th filament. There can be multiple trigonometric functions, i.e., (3) can be written

$$a = c_0 + \sum_{n=1}^{N} \left( c_{2n-1} \sin n\alpha + c_{2n} \cos n\alpha \right), \tag{5}$$

where $N$ is the number of trigonometric functions and $\alpha$ an angle parameter.¹ For $N = 0$ this reduces to the floating potential. Functions other than trigonometric ones could also be used on the interface.

The solution of (1)–(4) is sought in the form

$$a = \sum_{i} d_i \lambda_i + \sum_{i,j} c_{ij} \hat{f}_i, \tag{6}$$

where each $\lambda_i$ is a Whitney nodal basis function associated with an interior node of $\Omega_e$, and the $\hat{f}_i$ are approximated by Whitney interpolants on the boundary; these interpolants are extended canonically into the domain $\Omega_e$.

In practice, the function space $L$ on the interface is reduced to $\hat{L}$ by using a projection matrix $Q$. In $Q$ we have a row for each basis function of $\hat{L}$ for each filament and columns for all nodes on the boundary. The $i$th row of $Q$ consists of the values of $f_i$ in the boundary nodes of one filament. In a standard system, the system matrix is formed from blocks

$$\begin{bmatrix} A_{\Omega\Omega} & A_{\Omega\Gamma} \\ A_{\Gamma\Omega} & A_{\Gamma\Gamma} \end{bmatrix}, \tag{7}$$

where $\Omega$ refers to the exterior of the filaments and $\Gamma$ refers to the filament boundary. We use $Q$ as follows

$$\tilde{A}_{\Omega\Omega} = A_{\Omega\Omega}, \tag{8}$$

$$\tilde{A}_{\Omega\Gamma} = A_{\Omega\Gamma} Q^T, \tag{9}$$

$$\tilde{A}_{\Gamma\Omega} = Q A_{\Gamma\Omega}, \tag{10}$$

$$\tilde{A}_{\Gamma\Gamma} = Q A_{\Gamma\Gamma} Q^T \tag{11}$$

to build the new system matrix $\tilde{A}$ in the same format as (7). Note that the block with most nodes, i.e. $\tilde{A}_{\Omega\Omega}$, is not transformed. The dimensions of the projection matrix are $(KM) \times (\text{boundary nodes})$.
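
The block reduction (8)–(11) can be sketched in plain Python as follows. This is only an illustration under stated assumptions: the helper names (`boundary_basis`, `build_q`, `reduce_blocks`) are hypothetical, the boundary nodes of each filament are assumed to be described by their angles on a circle, and small dense lists stand in for the sparse FEM blocks.

```python
import math

def boundary_basis(angles, n_harmonics):
    """Rows of Q for one filament: constant, then sin(n*a), cos(n*a), n = 1..N."""
    rows = [[1.0 for _ in angles]]
    for n in range(1, n_harmonics + 1):
        rows.append([math.sin(n * a) for a in angles])
        rows.append([math.cos(n * a) for a in angles])
    return rows

def build_q(angles_per_filament, n_harmonics):
    """Block-diagonal Q with K*M rows and one column per boundary node."""
    total = sum(len(a) for a in angles_per_filament)
    q, offset = [], 0
    for angles in angles_per_filament:
        for row in boundary_basis(angles, n_harmonics):
            full = [0.0] * total
            full[offset:offset + len(angles)] = row
            q.append(full)
        offset += len(angles)
    return q

def transpose(a):
    return [list(col) for col in zip(*a)]

def matmul(a, b):
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

def reduce_blocks(a_oo, a_og, a_go, a_gg, q):
    """Apply (8)-(11): only the blocks touching the interface are projected."""
    qt = transpose(q)
    return a_oo, matmul(a_og, qt), matmul(q, a_go), matmul(matmul(q, a_gg), qt)

# Toy data (ASSUMPTION): K = 2 filaments, 8 boundary nodes each, N = 1 (M = 3).
angles = [[2.0 * math.pi * k / 8 for k in range(8)] for _ in range(2)]
Q = build_q(angles, n_harmonics=1)
A_oo = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
A_og = [[1.0] * 16 for _ in range(4)]
A_go = transpose(A_og)
A_gg = [[2.0 if i == j else 0.0 for j in range(16)] for i in range(16)]
R_oo, R_og, R_go, R_gg = reduce_blocks(A_oo, A_og, A_go, A_gg, Q)
print(len(Q), len(Q[0]), len(R_gg), len(R_gg[0]))
```

In the toy example the 16 boundary unknowns collapse to $KM = 6$ interface unknowns, while the exterior block is passed through unchanged, as noted above for $\tilde{A}_{\Omega\Omega}$.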

Remark 1. If the energy stored and dissipated in the filaments is neglected, the exterior problem can be solved independently. However, if we use equations (1)–(4) without including the effects of the filament interiors, the results are not satisfactory. As an example, consider the E-magnet of Figure 1. The model lacks the reluctance and eddy current effects of the filament interiors, and thus the filaments offer no reluctance to the flux. This effect becomes more pronounced when the filaments are close to each other. In a tightly wound coil the energy stored inside the filaments is considerable and, in addition, the eddy current losses inside the filaments affect the exterior magnetic field. In Figure 2, flux lines are shown for an exterior solution with a constant, a sine, and a cosine function. In Figure 3 a reference result with the A-V-formulation and a finely discretized mesh in the whole domain is shown for comparison. The A-V-formulation provides an accurate solution, but the computational cost is high.

1 Here sin and cos are to be understood as the Whitney nodal basis 

function interpolations of the trigonometric functions.

Fig. 2. The solution to E-magnet problem of Fig.1 with a constant, 

sine and cosine on the boundary. Since the interior energy is ignored, 

the filaments offer a zero reluctance path for the flux and the flux gets 

attracted into the filaments. 

Fig. 3. The E-magnet problem solved in Ω with an A-V-formulation as 

a reference result. The solution is very accurate, but the computational 

cost is high. 

III. INTERIOR FORMULATION 

A. Expansion of the Basis Functions to the Interior 

Because of the stored and dissipated energy inside the filaments and its effect on the exterior, the interiors cannot be separated from the exterior completely. Thus, we need a model for the interior part and have to solve it simultaneously with the exterior. We take the low-dimensional function space $\hat{L}$ on the filament interface as the starting point.

For the interior problem we have

$$\operatorname{div} \frac{1}{\mu} \operatorname{grad} a = j\omega\sigma \left( a + \frac{V_z}{j\omega} \right) \quad \text{in } \Omega_j, \tag{12}$$

$$\oint_{\partial\Omega_j} h \cdot dl = I_j \quad \text{on } \partial\Omega_j \ \forall j, \tag{13}$$

$$a_j = a_e \quad \text{on } \partial\Omega_j \ \forall j, \tag{14}$$

where $V_z$ is the filamentwise constant potential gradient and $a_e$ is the value of the exterior problem solution on the filament boundary.² We want to represent the solution of this eddy-current problem inside the filament by using only the unknowns $c_{ij}$ related to the functions of $\hat{L}$ plus one unknown associated with the net current condition (13). Note that computational effort is saved in the interior-exterior coupling, since we restrict $a_e$ to be spanned by the functions of $\hat{L}$.

Once we have extensions of $\hat{L}$ into the filament, we can assemble a FEM assembly block and the corresponding excitation for the problem (12)–(14). The single block can then be used efficiently for all filaments of identical cross-section. Even though we have to solve one boundary value problem (BVP) in the filament per basis element of $\hat{L}$, we only have to solve them once for filaments with the same cross-section. Usually the number of BVPs is much smaller than the number of similar filaments.

To produce accurate loss estimates, we choose the extensions to be the solutions of (12)–(14) with the basis of $\hat{L}$ as boundary conditions. Hence, we decompose the problem into $M + 1$ separately solvable problems, and the solution of the eddy-current problem (12)–(14) is a linear combination of these solutions.³ These solutions are then used to extend the basis $\{\hat{f}_j\}$ of (6) inside the filaments and to add $K$ extra unknowns for the net currents to the exterior problem. Note that we are not restricted to a specific geometry, because these solutions can be obtained by any means.

The first $M$ problems to solve are

$$\operatorname{div} \frac{1}{\mu} \operatorname{grad} a_i - j\omega\sigma a_i = 0 \tag{15}$$

with the boundary condition $a_i = f_i$ on $\partial\Omega_j$, where $f_i \in \hat{L}$. The remaining problem accounts for the filamentwise constant potential $V_z$:

$$\operatorname{div} \frac{1}{\mu} \operatorname{grad} a_\xi - j\omega\sigma a_\xi = j\omega\sigma \cdot 1 \tag{16}$$

with $a_\xi = 0$ on $\partial\Omega_j$, where $V_z / (j\omega)$ is spanned by the constant 1. As the term $(a_\xi + 1)$ equals 1 on the interface, we can use this term to impose the net current of the filament with a circulation of $h$, where $h$ is the magnetic field. Note that the restriction of $a_0$ into $\Omega_j$ qualifies as $(a_\xi + 1)$, so that it involves no extra cost. The solution within the filament for the electric field (divided by $j\omega$) is expressed by the space of linear combinations

$$a + \frac{V_z}{j\omega} = c_\xi (a_\xi + 1) + \sum_{i=0}^{M} c_i a_i, \tag{17}$$

where $a$ is the magnetic vector potential inside the filament and $c_\xi = V_z / (j\omega)$.

2 At first glance, it might seem like the problem is overdetermined, 

because (14) states a Dirichlet condition on all of the boundary. 

However, we have as many extra conditions (13) as we have constants 

Vz in (12), namely K. 

3 This requires the material to be linear inside the filaments.

B. Assembly Block in Entire Problem 

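Let us see how the system matrix of the exterior problem is modified when these extended basis functions (the $a_i$ and $a_\xi$) are used. We form an assembly block that is added to the system matrix from the problem

$$\operatorname{div} \frac{1}{\mu} \operatorname{grad} a = j\omega\sigma \left( a + \frac{V_z}{j\omega} \right) \quad \text{in } \Omega, \tag{18}$$

because we combine the solution from the interior with the exterior problem. The variational formulation for this is

$$\int_{\Omega} w \left( \operatorname{div} \frac{1}{\mu} \operatorname{grad} a - j\omega\sigma \left( a + \frac{V_z}{j\omega} \right) \right) d\Omega = 0, \tag{19}$$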

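where $w$ is a test function. After integration by parts, (19) becomes

$$\int_{\Omega} \operatorname{grad} w \cdot \frac{1}{\mu} \operatorname{grad} a \, d\Omega = \int_{\partial\Omega} w \frac{1}{\mu} \operatorname{grad} a \cdot n \, dl - \int_{\Omega} w \, j\omega\sigma \left( a + \frac{V_z}{j\omega} \right) d\Omega. \tag{20}$$

Because of the homogeneous Dirichlet condition for $a$ everywhere on $\partial\Omega$, the boundary integrals are zero for $w = a_i$:

$$\int_{\Omega} \operatorname{grad} a_i \cdot \frac{1}{\mu} \operatorname{grad} a + j\omega\sigma a_i \left( a + \frac{V_z}{j\omega} \right) d\Omega = 0. \tag{21}$$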

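With the weight $w = a_\xi + 1$, the net current condition is imposed for each filament. Now, $a_\xi + 1$ is one at the boundary of its support $\Omega_j$, and (20) becomes

$$\int_{\Omega_j} \operatorname{grad}(a_\xi + 1) \cdot \frac{1}{\mu} \operatorname{grad} a + j\omega\sigma (a_\xi + 1) \left( a + \frac{V_z}{j\omega} \right) d\Omega = \int_{\partial\Omega_j} 1 \cdot \frac{1}{\mu} \operatorname{grad} a \cdot n \, dl = \oint_{\partial\Omega_j} h \cdot dl = I_j, \tag{22}$$

where $I_j$ is the imposed net current.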

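After substituting (17) into (21) and (22), we can reduce most of the assembly block elements of (21) and (22) to an integral on the boundary of the filament. For solutions $a_i$ and $a_j$ of (15), consider the integral equation

$$\int_{\Omega_j} a_i \left( \operatorname{div} \frac{1}{\mu} \operatorname{grad} a_j - j\omega\sigma a_j \right) d\Omega = 0, \tag{23}$$

which holds because $a_j$ is a solution of (15). After integration by parts we get

$$\int_{\partial\Omega_j} a_i \frac{1}{\mu} \operatorname{grad} a_j \cdot n \, dl = \int_{\Omega_j} \operatorname{grad} a_i \cdot \frac{1}{\mu} \operatorname{grad} a_j + j\omega a_i \sigma a_j \, d\Omega, \tag{24}$$

which occurs as a basic building block in (21) and (22).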


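Fig. 4. Extension of the basis function inside the filament with the constant boundary condition in the E-magnet example, real part in solid line and imaginary part in dashed line. The plot runs across the filament $\Omega_j$ from one side of the boundary $\partial\Omega_j$ to the other.

Fig. 5. Extension of the basis function inside the filament with the sine boundary condition in the E-magnet example, real part in solid line and imaginary part in dashed line. The plot runs across the filament $\Omega_j$ from one side of the boundary $\partial\Omega_j$ to the other.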

C. Example: Circular conductors 

Circular conductors of radius $a$ admit exact solutions of (15) in closed form for a constant, sines and cosines:

$$a_0 = \frac{J_0(\sqrt{-j\omega\sigma\mu}\, r)}{J_0(\sqrt{-j\omega\sigma\mu}\, a)}, \tag{25}$$

$$a_{2n-1} = \frac{J_n(\sqrt{-j\omega\sigma\mu}\, r)}{J_n(\sqrt{-j\omega\sigma\mu}\, a)} \cos n\varphi, \tag{26}$$

$$a_{2n} = \frac{J_n(\sqrt{-j\omega\sigma\mu}\, r)}{J_n(\sqrt{-j\omega\sigma\mu}\, a)} \sin n\varphi, \tag{27}$$

where $J_n$ are Bessel functions of the first kind of order $n$. In Figure 4, we see a cross-sectional view of the extension of the constant function ($a_0$) into the filament and in Figure 5 the extension of the sine function ($a_2$).
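
The closed-form extensions (25)–(27) are easy to evaluate numerically. The sketch below is an illustration, not the authors' implementation: it uses a truncated power series for $J_n$ with complex argument (the function `bessel_j` and the truncation at 40 terms are choices made here), and it parameterizes the wave number by the skin depth, using $-j\omega\sigma\mu = -2j/\delta^2$, so that $\sqrt{-j\omega\sigma\mu} = (1 - j)/\delta$.

```python
import math

def bessel_j(n, z, terms=40):
    """J_n(z) for integer n >= 0 and complex z, via the standard power series."""
    total = 0j
    for k in range(terms):
        total += ((-1) ** k * (z / 2) ** (2 * k + n)
                  / (math.factorial(k) * math.factorial(k + n)))
    return total

def extension(index, r, phi, radius, delta):
    """Interior extension a_index(r, phi) following (25)-(27).

    index 0 -> constant, odd index 2n-1 -> cos(n*phi), even index 2n -> sin(n*phi).
    """
    k = (1 - 1j) / delta            # k**2 = -2j/delta**2 = -j*omega*sigma*mu
    if index == 0:
        n, angular = 0, 1.0
    else:
        n = (index + 1) // 2
        angular = math.cos(n * phi) if index % 2 == 1 else math.sin(n * phi)
    return bessel_j(n, k * r) / bessel_j(n, k * radius) * angular

radius = 0.01            # filament radius of the example [m]
delta = 0.92 * radius    # skin depth from the ratio quoted with Table I

on_boundary = extension(0, radius, 0.0, radius, delta)
at_centre = extension(0, 0.0, 0.0, radius, delta)
sine_top = extension(2, radius, math.pi / 2, radius, delta)
print(on_boundary, at_centre, sine_top)
```

On the boundary each extension reproduces its interface function (value 1 for the constant, $\sin\varphi$ for $a_2$), while inside the filament the values become complex, which is what the solid and dashed curves of Figures 4 and 5 display.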

In Figure 6, flux lines are shown for the interface technique with one sine and one cosine function and with the extended basis functions. The comparison with the A-V-formulated case (Fig. 3) shows the flux to be very similar. The figures cannot be identical, since the interface function space of the interface technique is severely restricted compared with that of the A-V-formulation, and the basis functions inside the filaments also differ.

Fig. 6. The flux lines obtained with the interface technique for the 

E-magnet are shown. Here we have taken into account the effect of 

the interior of the filaments to the exterior solution and have a good 

correlation with the reference solution. 

Fig. 7. Filament numbers used in Table I for the E-magnet. 

IV. LOSS ESTIMATES 

By using the interior solution (17), we compute in $\Omega_j$ the time average of the losses

$$P = \frac{1}{2} \int_{\Omega_j} \operatorname{Re}\{ e \cdot j^{*} \} \, da, \tag{28}$$

where $e = -j\omega (a + V_z / (j\omega))$ and $j = \sigma e$.

Losses for selected filaments (see Figure 7) of the example in Fig. 1 are shown in Table I. Losses are computed for five different filaments with the interface technique and, for comparison, with the A-V-formulation throughout. For the interface technique, the assembly block was produced with (25)–(27). The loss estimates were also computed analytically. The reference results for the A-V-formulation were obtained with GetDP [8] to verify our loss estimates from MATLAB®. The greatest error is in filament 2, where the error is -4.0%.

In Table II some computationally relevant figures for the reference solution and the interface technique are shown. The mesh outside the filaments is of equal density in both methods. The system of equations was solved using the backslash operator in MATLAB®, and the number of nonzero elements (nnz) in the system matrix is much lower than in the A-V-formulation. Most of the saved elements are in the conducting regions and this saves computation time. Additionally, for the A-V-formulation, the number of nodes required to maintain accuracy inside the filaments has to increase significantly with increasing frequency.

TABLE I
LOSSES IN NUMBERED FILAMENTS WITH UNIT CURRENT AND f = 50 HZ. A-V RESULTS FROM GETDP AND INTERFACE TECHNIQUE FROM MATLAB®. δ/r = 0.92, WHERE $\delta = \sqrt{2/(\omega\sigma\mu)}$ AND r THE RADIUS OF THE FILAMENT.

Filament   A-V [mW/m]   Interface technique [mW/m]   Error %
1          1.012        1.009                         0.2
2          0.7185       0.7470                       -4.0
3          0.2634       0.2703                       -2.6
4          0.04252      0.04200                       1.2
5          0.03497      0.03480                       0.5

TABLE II
SOME PERFORMANCE INDICATORS FOR COMPUTATIONS. NNZ IS THE NUMBER OF NONZERO ELEMENTS.

                           A-V          Interface technique   Difference (%)
Nodes                      153 796      55 206                -64.1
DoFs                       153 435      49 645                -67.6
Time [s]                   3.928        0.5230                -86.7
nnz                        1 280 535    367 240               -71.3
No. of DoFs in filaments   98 525       100                   -99.9
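
The entries of Table II can be cross-checked with a short calculation. The sketch below assumes $M = 3$ interface functions per filament (constant, sine, cosine), which is consistent with the example but not stated next to the table; the formulation then needs $M + 1$ unknowns per filament (the coefficients $c_{ij}$ plus one net-current unknown).

```python
K = 25   # filaments in the E-magnet example
M = 3    # ASSUMPTION: interface functions per filament (constant, sine, cosine)

interface_filament_dofs = K * (M + 1)   # c_ij coefficients plus one net-current unknown

def relative_difference(reference, value):
    """Difference of `value` with respect to `reference`, in percent."""
    return 100.0 * (value - reference) / reference

table_ii = {
    "Nodes": (153796, 55206),
    "DoFs": (153435, 49645),
    "Time [s]": (3.928, 0.5230),
    "nnz": (1280535, 367240),
    "DoFs in filaments": (98525, interface_filament_dofs),
}
differences = {name: round(relative_difference(av, it), 1)
               for name, (av, it) in table_ii.items()}
for name, diff in differences.items():
    print(f"{name:20s} {diff:6.1f} %")
```

Under this assumption the count of 100 filament unknowns in Table II is reproduced exactly, and the percentage column follows from the first two columns.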

V. CONCLUSION 

An approach to modeling coil filament losses was proposed. We expanded the function space on the filament boundaries from the floating potential approach by adding trigonometric functions. We observed that the filament interiors need to be considered as well, due to the significant effect of the magnetic energy stored and dissipated inside them. When the interface basis functions were extended into the filaments with solutions of magneto-quasi-static problems, and these were used as FEM basis functions, the loss estimates were at most 4% away from the A-V-formulated estimates. The computation time was significantly reduced in a small problem consisting of 25 filaments.


The authors thank Professor Stefan Kurz for discussion 

and comments. 

REFERENCES 

[1] O. Moreau, L. Popiel and J. Pages, ”Proximity Losses Computation 

with a 2D Complex Permeability Modelling,” IEEE Trans. Magn., 

vol. 34, pp. 3616-3619, 1998. 

[2] H. de Gersem and K. Hameyer, ”A Multiconductor Model for 

Finite-Element Eddy-Current Simulation,” IEEE Trans. Magn., vol. 

38, pp. 533-536, 2002. 

[3] A. Podoltsev, I. Kucheryavaya and B. Lebedev, ”Analysis of 

effective resistance and eddy-current losses in multiturn winding 

of high-frequency magnetic components,” IEEE Trans. Magn., vol. 

36, pp. 539-548, 2003.

[4] J. Gyselinck, R. Sabariego and P. Dular, ”Time-Domain homogenization 

of windings in 2-D finite element models,” IEEE Trans. 

Magn., vol. 43, pp. 1297-1300, 2007. 

[5] T. Le-Duc, G. Meunier, O. Chadebec and J.-M. Guichon, ”A 

new integral formulation for eddy current computation in thin 

conductive shells,” IEEE Trans. Magn., vol. 48, pp. 427-430, 2012. 

[6] L. Lehti, J. Keränen, S. Suuriniemi, and L. Kettunen, ”Subsystem 

separation by flux linkage in coil filament modelling,” ACOMEN 

2011, Liège. 

[7] P. Dular, W. Legros, H. De Gersem, and K. Hameyer, ”Floating 

potentials in various electromagnetic problems using the finite 

element method,” Proc. of the 4th int. workshop on electric and 

magnetic fields, 1998, Marseille. 

[8] P. Dular and C. Geuzaine, GetDP: a General Environment for the 

Treatment of Discrete Problems, available: http://geuz.org/getdp/ 



Optimization of Energy Storage Usage 

Arnel Glotic¹, Peter Kitak¹, Igor Ticar¹, Adnan Glotic²

¹ University of Maribor, Faculty of Electrical Engineering and Computer Science, Smetanova 17, SI-2000 Maribor, Slovenia

² Holding Slovenske elektrarne Group, Koprska ulica 92, SI-1000 Ljubljana, Slovenia

Abstract — Energy storage is physical storage for energy, such as batteries, flywheels, compressed air storage and pumped storage. This paper presents the use of an optimization algorithm to achieve the optimal usage of energy storage. Reservoirs of cascade hydro power plants have been used as the model of energy storage; their scheduling is known to be a complex optimization problem. The optimization algorithm used in this paper is an adapted differential evolution algorithm.

Index Terms — Differential evolution, energy storage, optimization, hydro power plants. 

I. INTRODUCTION


Energy storage [1] is physical storage for energy and exists in different types. The authors’ research has focused on cascade hydro power plants (HPP), where each individual plant has its own reservoir and thus its own energy storage.

Various combinations of charging and discharging the reservoirs produce different amounts of electricity. In order to achieve optimal production, several methods can be implemented [2], such as Lagrangian relaxation and Benders decomposition-based methods, mixed-integer programming, dynamic programming, evolutionary computing methods, artificial intelligence methods and interior-point methods.

The Differential Evolution (DE) algorithm [3] is an efficient and robust global optimization algorithm and has therefore been selected in this paper as an appropriate optimization technique.

Short-term optimization using DE with self-adaptive parameter settings was applied in [4] to four cascaded HPP, where the best objective value was reached after 2000 generations. The modified DE presented in [5] includes a penalty factor in the objective function evaluation, which preserves the required final reservoir levels of four cascaded HPP. In [6] the authors combined the advantages of two modified DE algorithms, where a grouping and shuffling operation is carried out over the population periodically.

Optimization of HPP reservoir scheduling is known to be a complex problem, where a larger number of HPP in the cascade means a much larger number of reservoir scheduling combinations and hence a longer convergence time. The main goal of this paper was to modify DE so that it is capable of reaching the global optimal solution with fast convergence. This means an adequate distribution of the electrical energy production of the individual HPP by scheduling the reservoirs in order to satisfy the demand for 24 hours. Besides satisfying the demand, a decreased usage of water quantity per electrical energy unit (m³/MWh) also has to be achieved. In addition, the optimization results must be obtainable within a couple of minutes.

The mathematical model of the cascade hydro power plants is described in section II, the standard and modified differential evolution algorithms in section III, the results in section IV, and the conclusion in section V.

E-mail: arnel.glotic@uni-mb.si

II. MATHEMATICAL MODEL OF CASCADE HYDRO POWER 

PLANTS 

The mathematical model describes the cascade of HPP on the Drava River in Slovenia, owned by Dravske elektrarne Maribor (DEM). DEM is a subsidiary of Holding Slovenske elektrarne (HSE), which is the biggest producer and trader of electricity in Slovenia. DEM provides approximately 25.5% of the energy produced in Slovenia, with a maximum output of 587 MW.

The mathematical model consists of eight cascaded plants, where the $i$-th HPP has the natural inflow $Q_{i,NI}^{t}$ in the observed hour $t$ of the day. The first HPP in the cascade structure has the inflow $Q_{i,I}^{t}$ of the Drava River coming from Austria. The source of the Drava River lies in Italy, near the Austrian-Italian border.

The total inflow for the first HPP in the observed hour $t$ is

$$Q_{i,TI}^{t} = Q_{i,I}^{t} + Q_{i,NI}^{t}, \quad i = 1, \ t = 1, 2, \ldots, 24, \tag{1}$$

where $Q_{i,TI}^{t}$ is the sum of inflows. The total inflow for the following seven HPP is expressed as

$$Q_{i,TI}^{t} = Q_{(i-1)}^{t} + Q_{i,NI}^{t}, \quad i = 2, 3, \ldots, 6, \ t = 1, 2, \ldots, 24, \tag{2}$$

where $Q_{(i-1)}^{t}$ is the outflow of the upper HPP, expressed as

$$Q_{i}^{t} = Q_{i,T}^{t} + Q_{i,O}^{t}, \quad i = 1, 2, \ldots, 8, \ t = 1, 2, \ldots, 24, \tag{3}$$

which represents the sum of the flow through the turbine and the overflow in the observed hour $t$. The last two HPP, HPP 7 and HPP 8, are canal-type HPP where the flows merge with the riverbed at the end of the canal. Both of these HPP have a required biological minimum flow $Q_{i,B}$, which must be provided to the riverbed.


The total inflow for the last two HPP in the chain is expressed as

$$Q_{i,TI}^{t} = Q_{(i-1)}^{t} + Q_{i,NI}^{t} - Q_{i,B}, \quad i = 7, 8, \ t = 1, 2, \ldots, 24. \tag{4}$$
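
The inflow bookkeeping of (1)–(4) can be summarized in a small helper. This is a sketch under assumptions: the function names and the numbers in the example are hypothetical, and the biological minimum flow is assumed to be subtracted from the inflow of the canal-type plants, since it must be left in the riverbed.

```python
def outflow(q_turbine, q_overflow):
    """Eq. (3): outflow of a plant = turbine flow + overflow."""
    return q_turbine + q_overflow

def total_inflow(i, q_natural, q_river=0.0, q_upstream_out=0.0, q_bio=0.0):
    """Total inflow of plant i (1-based) in one hour, following (1), (2) and (4).

    q_river        -- Drava inflow from Austria, used only for plant 1
    q_upstream_out -- outflow of plant i-1, used for plants 2..8
    q_bio          -- biological minimum flow, used only for canal plants 7 and 8
                      (ASSUMPTION: it is subtracted from the canal inflow)
    """
    if i == 1:
        return q_river + q_natural
    if i in (7, 8):
        return q_upstream_out + q_natural - q_bio
    return q_upstream_out + q_natural

# Hypothetical flows in m^3/s for a single hour.
q1_in = total_inflow(1, q_natural=10.0, q_river=200.0)
q1_out = outflow(q_turbine=190.0, q_overflow=0.0)
q2_in = total_inflow(2, q_natural=5.0, q_upstream_out=q1_out)
q7_in = total_inflow(7, q_natural=2.0, q_upstream_out=180.0, q_bio=10.0)
print(q1_in, q2_in, q7_in)
```

Chaining these calls over all eight plants and 24 hours gives the coupling between reservoirs that makes the scheduling problem hard: every decision at an upper plant changes the inflow of all plants below it.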

Inflow water can be used for charging the reservoir up to the maximal reservoir value $V_{i,\max}$, or it can be used in combination with the flow gained from discharging the reservoirs. However, in the observed hour $t$ the reservoir value $V_{i}^{t}$ must lie between the minimal and maximal allowed values of the individual reservoir. All reservoirs also have a prescribed maximal discharging value.

The hydro generator output power is expressed as

$$P_{i}^{t} = c_{i,1} (V_{i}^{t})^{2} + c_{i,2} (Q_{i}^{t})^{2} + c_{i,3} V_{i}^{t} Q_{i}^{t} + c_{i,4} V_{i}^{t} + c_{i,5} Q_{i}^{t} + c_{i,6}, \tag{5}$$

where $c$ represents the hydropower generation coefficients. In cases where the inflow $Q_{i,TI}^{t}$ is larger than the maximal allowed flow through the turbines of the $i$-th HPP and the reservoir level $V_{i}^{t}$ reaches the maximal value allowed, the overflow $Q_{i,O}^{t}$ is unavoidable and can be expressed as

$$Q_{i,O}^{t} = Q_{i,TI}^{t} - Q_{i}^{t}(P_{i,\max}^{t}), \quad i = 1, 2, \ldots, 8, \ t = 1, 2, \ldots, 24, \tag{6}$$

where $Q_{i}^{t}(P_{i,\max}^{t})$ is the flow through the turbines that provides the maximal output power. Power generation also considers the head effect,

$$H_{i}^{t} = H(V_{i}^{t}) - H_{i,O}^{t}, \quad i = 1, 2, \ldots, 8, \ t = 1, 2, \ldots, 24, \tag{7}$$

where $H_{i}^{t}$ is the difference between the inlet and outlet head, $H(V_{i}^{t})$ is the level of the reservoir at volume $V_{i}^{t}$ and $H_{i,O}^{t}$ is the level of the outlet. Both levels are expressed with a polynomial of the sixth degree:

$$H_{i,O}^{t} = k_{i,1} (Q_{i}^{t})^{6} + k_{i,2} (Q_{i}^{t})^{5} + k_{i,3} (Q_{i}^{t})^{4} + k_{i,4} (Q_{i}^{t})^{3} + k_{i,5} (Q_{i}^{t})^{2} + k_{i,6} Q_{i}^{t} + k_{i,7}, \tag{8}$$

where the $k_i$ are the coefficients of the polynomial, obtained by experimental measurements of each reservoir and provided by DEM personnel.

Figure 1: Layout of two hydropower plants

III. OPTIMIZATION ALGORITHM 

The differential evolution (DE) algorithm, proposed by R. Storn and K. Price [3], has been used as an effective global optimizer. The main steps of the DE algorithm are initialization, mutation, crossover, evaluation and selection. In the initialization step the population is chosen randomly. Each individual $x_i$ of the initial population is composed of $D$ variables $x_j$:

$$x_{j,G} = x_{j,upp} - \operatorname{rand}(0,1) \cdot (x_{j,upp} - x_{j,low}), \quad j = 1, 2, \ldots, D, \tag{9}$$

$$x_{i,G} = (x_1, x_2, \ldots, x_D), \quad i = 1, \ldots, NP, \tag{10}$$

where $x_{j,upp}$ and $x_{j,low}$ are the upper and lower bounds defined for each variable $x_j$, $G$ denotes the generation, $NP$ the population size, $D$ the number of parameters (the problem dimension) and $i$ the index of the population member (individual). The population size depends on the number of problem variables $D$, i.e. the number of parameters of the objective function.

For the proposed mathematical model, the upper and lower bounds of the optimization algorithm are defined as the maximal and minimal values of the individual reservoirs. Therefore, after the initialization, the population is composed of $NP$ $D$-dimensional vectors:

$$x_{i,G} = \left( V_{i,1}^{1}, \ldots, V_{i,1}^{24}, \ldots, V_{i,8}^{1}, \ldots, V_{i,8}^{24} \right), \quad i = 1, 2, \ldots, NP, \tag{11}$$

where $V$ is the volume of an individual HPP reservoir at time $t$. At the initialization step of DE the volumes are randomly chosen for each individual HPP and for each individual hour of the 24-hour period. The dimension of the problem is therefore $D = 8 \cdot 24 = 192$, and the population size is five times larger, i.e. $NP = 960$.
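
The initialization can be sketched as follows. The volume bounds `V_MIN` and `V_MAX` are hypothetical placeholders (the real reservoir limits are not given in the text); the sampling draws each volume uniformly between its bounds, written here as upper bound minus a random fraction of the range, which gives the same distribution as lower bound plus a random fraction.

```python
import random

N_PLANTS, N_HOURS = 8, 24
D = N_PLANTS * N_HOURS   # problem dimension: one volume per plant and hour
NP = 5 * D               # population size, five times the dimension

# ASSUMPTION: placeholder volume bounds per plant, not the real DEM data.
V_MIN = [1.0] * N_PLANTS
V_MAX = [2.0] * N_PLANTS

def init_population(v_min, v_max, rng):
    """Return NP individuals, each a flat list (V_1^1..V_1^24, ..., V_8^1..V_8^24)."""
    population = []
    for _ in range(NP):
        individual = []
        for plant in range(N_PLANTS):
            span = v_max[plant] - v_min[plant]
            for _ in range(N_HOURS):
                individual.append(v_max[plant] - rng.random() * span)
        population.append(individual)
    return population

population = init_population(V_MIN, V_MAX, random.Random(1))
print(D, NP, len(population), len(population[0]))
```

Storing each individual as a flat list ordered plant by plant mirrors the layout of the vector in (11).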

The mutation step follows the initialization step. For each target individual, also referred to as the vector $x_{i,G}$, a mutant vector is created according to the selected strategy. The strategy applied in this paper is formulated as

$$v_{i,G} = x_{i,G} + F \left( x_{best,G} - x_{i,G} \right) + F \left( x_{r_1,G} - x_{r_2,G} \right), \quad i = 1, 2, \ldots, NP, \tag{12}$$

where $x_{r_1}$ and $x_{r_2}$ are randomly chosen individuals with indices from the interval $[1, NP]$, $x_{best,G}$ represents the best individual of generation $G$ and $F$ is the weight.

The following step is crossover, where for each mutant vector a new trial vector $u_{i,G}$ is produced via “binary” crossover:

$$u_{i,j,G} = \begin{cases} v_{i,j,G} & \text{if } \operatorname{rand}(0,1) \le CR \text{ or } j = j_{rand}, \\ x_{i,j,G} & \text{if } \operatorname{rand}(0,1) > CR \text{ and } j \ne j_{rand}, \end{cases} \quad i = 1, 2, \ldots, NP, \ j = 1, 2, \ldots, D, \tag{13}$$

where $CR$ is the crossover constant selected by the user. The index $j_{rand}$ is a randomly chosen integer from the interval $[1, D]$, which ensures that the trial vector obtains at least one of the parameters from the mutant vector.

In the last step, known as selection, DE evaluates the trial vector and the target vector, commonly referred to as the parent vector:

$$x_{i,G+1} = \begin{cases} u_{i,G} & \text{if } f(u_{i,G}) \le f(x_{i,G}), \\ x_{i,G} & \text{if } f(u_{i,G}) > f(x_{i,G}), \end{cases} \quad i = 1, 2, \ldots, NP, \tag{14}$$

where the vector with the lower objective function value occupies the position in the next generation ($G+1$). This comparison is made for each of the $NP$ individuals, the new population of generation $G+1$ is selected, and the steps of the DE algorithm start once again in the following order: mutation, crossover, evaluation and selection. The algorithm repeats all steps until one of the stopping criteria is reached.
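
One generation of the mutation, crossover and selection steps described above can be sketched as below. Several details are assumptions made for this illustration and are not stated in the text: the indices $r_1$ and $r_2$ are drawn distinct from each other and from $i$, mutants are clamped to the bounds, and a simple sphere function stands in for the hydro scheduling objective.

```python
import random

def de_generation(pop, fitness, objective, F, CR, lower, upper, rng):
    """One DE generation: current-to-best mutation, binomial crossover, greedy selection."""
    NP, D = len(pop), len(pop[0])
    best = pop[min(range(NP), key=fitness.__getitem__)]
    new_pop, new_fit = [], []
    for i in range(NP):
        # ASSUMPTION: r1 and r2 are distinct and differ from i.
        r1, r2 = rng.sample([k for k in range(NP) if k != i], 2)
        j_rand = rng.randrange(D)
        trial = []
        for j in range(D):
            # Mutation: x_i + F*(x_best - x_i) + F*(x_r1 - x_r2)
            v = pop[i][j] + F * (best[j] - pop[i][j]) + F * (pop[r1][j] - pop[r2][j])
            v = min(max(v, lower[j]), upper[j])   # ASSUMPTION: clamp to the bounds
            # Crossover: take the mutant component with probability CR, or at j_rand
            if rng.random() <= CR or j == j_rand:
                trial.append(v)
            else:
                trial.append(pop[i][j])
        f_trial = objective(trial)
        # Selection: the vector with the lower objective value survives
        if f_trial <= fitness[i]:
            new_pop.append(trial)
            new_fit.append(f_trial)
        else:
            new_pop.append(pop[i])
            new_fit.append(fitness[i])
    return new_pop, new_fit

def sphere(x):
    """Stand-in objective for the demonstration."""
    return sum(v * v for v in x)

rng = random.Random(0)
lower, upper = [-5.0] * 4, [5.0] * 4
pop = [[rng.uniform(-5.0, 5.0) for _ in range(4)] for _ in range(12)]
fit = [sphere(p) for p in pop]
initial_best = min(fit)
for _ in range(30):
    pop, fit = de_generation(pop, fit, sphere, 0.5, 0.9, lower, upper, rng)
print(initial_best, min(fit))
```

Because of the greedy selection, the best objective value of the population can never get worse from one generation to the next.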

DE control parameters F, CR and strategy are selected 

by the user and have an important influence on the 

convergence time, global or local search and manner of 

creating new mutants. Use of the standard DE for solving 

the presented optimization problem may not always lead 

towards the global solution, regardless of the effort given 

in order to choose the adequate control parameters. The 

algorithm can be easily trapped into local optimum and 

also the convergence time can be drastically increased. In 

order to overcome these problems the modified algorithm 

uses self-adaptive F and CR. For the initial generation 

both of the control parameter are selected by the user and 

vary along with the iteration number according to (15) 

and (16): 

FRif f( xbest, G1) f( xbest, 

G) 

 

FiG 

, 1 

 

FiG , Otherwise. 

(15) 

 

i 1,2..., NP 

If the algorithm finds a better solution in generation G 


compared to generation G - 1, then a randomly selected 

FR is employed in generation G + 1. 

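$$
CR_{i,G+1} =
\begin{cases}
CR_R & \text{if } F_{i,G+1} \ne F_{i,G} \ \text{and} \ \mathrm{rand}(0,1) < 0.1 \\
CR_{i,G} & \text{otherwise}
\end{cases}
\qquad i = 1, 2, \ldots, NP \tag{16}
$$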

A random CR_R is also applied in generation G+1 if an F_R was previously employed and, at the same time, a value randomly selected from the interval [0, 1] is lower than 0.1. The described modification leads toward the global solution and improves the convergence time. A further improvement in convergence time can be achieved by parallel computation.
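The self-adaptation rule described by (15) and (16) might be sketched as follows. The sampling ranges for F_R and CR_R are not given in the text, so `f_range` and `cr_range` are assumptions, as is the function name; the condition "F was redrawn" is used in place of the comparison of F values in (16).

```python
import random

def adapt_parameters(F, CR, improved, rng,
                     f_range=(0.1, 1.0), cr_range=(0.0, 1.0), p_cr=0.1):
    """Return updated per-individual lists (F, CR) in the spirit of (15), (16).

    improved: True if the best objective value decreased from G-1 to G.
    f_range, cr_range: assumed sampling intervals for F_R and CR_R.
    """
    new_F, new_CR = [], []
    for F_i, CR_i in zip(F, CR):
        if improved:
            F_next = rng.uniform(*f_range)       # random F_R, as in (15)
            redraw_cr = rng.random() < p_cr      # rand(0,1) < 0.1, as in (16)
            CR_next = rng.uniform(*cr_range) if redraw_cr else CR_i
        else:
            F_next, CR_next = F_i, CR_i          # keep the previous values
        new_F.append(F_next)
        new_CR.append(CR_next)
    return new_F, new_CR

rng = random.Random(42)
F, CR = [0.5] * 4, [0.8] * 4
F_same, CR_same = adapt_parameters(F, CR, improved=False, rng=rng)
F_new, CR_new = adapt_parameters(F, CR, improved=True, rng=rng)
```

With `improved=False` the parameters pass through unchanged, which mirrors the "otherwise" branches of both equations.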

The presented optimization problem is a multiobjective problem [7], in which three different objectives are merged into a single one by using the weighted sum method [8]. The first goal of the optimization process is to satisfy the 24-hour demand by scheduling the reservoirs of the cascade of HPPs. The second objective is to decrease the quantity of water used per unit of electrical energy (m³/MWh) while the demand is satisfied. The third objective is to decrease and, where possible, eliminate overflow. The objective functions for the individual objectives are expressed as:

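$$
\begin{aligned}
f_1 &= \frac{1}{24} \sum_{t=1}^{24} \left( W^{t}_{demand} - \sum_{i=1}^{8} W^{t}_{i,opt} \right)^{2} \\
f_2 &= \frac{\sum_{i=1}^{8} \sum_{t=1}^{24} Q^{t}_{i,T}}{\sum_{i=1}^{8} \sum_{t=1}^{24} W^{t}_{i,opt}} \\
f_3 &= \sum_{i=1}^{8} \sum_{t=1}^{24} Q^{t}_{i,O}
\end{aligned}
\tag{17}
$$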

The demand energy W^t_demand and the optimal production energy W^t_{i,opt} are formulated as the product of power P and time t:

$$
W = P \, t \quad [\mathrm{Wh}] \tag{18}
$$

The unified objective function f is defined as

$$
f = f_1 w_1 + f_2 w_2 + f_3 w_3 \tag{19}
$$

where each individual objective is normalized and the weights are set according to the selected priority of the individual objective. The values selected for the given problem were 0.6, 0.15 and 0.25, respectively.
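As a small illustration of (19), the weighted sum can be evaluated as below. The sketch assumes the three objectives have already been normalized (the normalization itself is not specified in the text); the function name and the example values are hypothetical.

```python
def unified_objective(objectives, weights=(0.6, 0.15, 0.25)):
    """Weighted sum of already normalized objectives, as in (19)."""
    if len(objectives) != len(weights):
        raise ValueError("one weight per objective is required")
    return sum(f_k * w_k for f_k, w_k in zip(objectives, weights))

# Hypothetical normalized values for f1 (demand mismatch),
# f2 (water per energy unit) and f3 (overflow).
f = unified_objective((0.2, 0.5, 0.0))
```

The large weight on f1 reflects the stated priority of satisfying the demand before reducing water use and overflow.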

IV. RESULTS 

The proposed modified DE algorithm has been used to achieve globally optimal production of the cascade of HPPs while satisfying the demand. The test data was a real 24-hour demand plan from SCADA. It is shown in Table I; it is valid for an observed day in the past and was realized in practice by scheduling the reservoirs.

Table I: The satisfied demand by scheduling reservoirs for the dispatcher, standard and modified DE

Columns, left to right: Time (h); demand scheduling of the reservoirs by the dispatcher (real data from SCADA): Energy (MWh), Water discharge (m³/h); satisfied demand by scheduling reservoirs, standard DE: Energy (MWh), Water discharge (m³/h); satisfied demand by scheduling reservoirs, modified DE: Energy (MWh), Water discharge (m³/h).

1 19.8 828000 0 0 19.0 738446 

2 6.8 349200 6.0 284205 7.0 218980 

3 0 0 0 0 0 0 

4 0 0 0 0 0 0 

5 0 0 0 0 0 0 

6 0 0 3.0 154185 0 0 

7 89.2 2080800 39.0 1372293 89.0 2106432 

8 368.6 8355600 368.1 9151060 368.6 8696690 

9 417.5 10018800 416.4 10288591 417.5 10553592 

10 315.5 7592400 314.7 6764925 315.5 8525579 

11 244.2 5968800 244.0 5853691 244.3 5827843 

12 313.3 7408800 312.7 7426771 313.3 7453704 

13 322.4 7740000 321.9 7392661 322.4 7548595 

14 332.7 8089200 332.4 8707231 332.6 7625784 

15 320.1 7776000 319.6 7223638 320.2 8137026 

16 306.7 7315200 306.4 6806393 306.7 6557976 

17 403.3 9518400 403.2 9281914 403.3 9743537 

18 397.2 9378000 396.7 9344417 397.1 9186977 

19 391.9 9381600 391.4 9250750 391.9 9260050 

20 306.5 7452000 306.6 6604082 306.5 7079875 

21 290.1 7185600 289.6 6610318 290.0 7440151 

22 272.3 6494400 272.2 5624925 272.3 5849220 

23 265.2 6091200 265.1 6813626 265.2 5808105 

24 249.0 5763600 248.6 5207242 248.9 5301717 

Total 5632.2 134787600 5557.4 130162918 5631.3 133660279 

The scheduling was made by the dispatch personnel of the DEM Company. This data was used as a reference for the optimization algorithms, the standard DE and the modified DE, whose objective was to satisfy the given demand by determining the optimal production of the individual HPPs during the 24-hour period. According to the results in Table I, the modified DE saved approximately 1.12 million m³ of water compared to manual dispatch, which equals approximately 50 MWh less potential energy used.
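The comparison can be retraced from the "Total" row of Table I. The short sketch below only does arithmetic on those published totals; the dictionary layout and variable names are illustrative.

```python
# Totals taken from the last row of Table I (energy in MWh, water in m^3).
totals = {
    "dispatcher":  {"energy_mwh": 5632.2, "water_m3": 134787600},
    "standard DE": {"energy_mwh": 5557.4, "water_m3": 130162918},
    "modified DE": {"energy_mwh": 5631.3, "water_m3": 133660279},
}

# Water saved by the modified DE relative to manual dispatch.
water_saved = totals["dispatcher"]["water_m3"] - totals["modified DE"]["water_m3"]

# Water used per unit of energy (the quantity targeted by the second objective).
specific_use = {name: row["water_m3"] / row["energy_mwh"]
                for name, row in totals.items()}

# Energy shortfall with respect to the dispatcher's schedule.
energy_gap = {name: totals["dispatcher"]["energy_mwh"] - row["energy_mwh"]
              for name, row in totals.items()}
```

The difference of the totals is 1,127,321 m³, close to the figure quoted above. The standard DE uses less water overall but leaves a much larger energy gap, which is why its schedule does not satisfy the demand.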

The authors of [8] showed the impact of DE’s control parameters on convergence and global optimization performance. With smaller F values a local optimum can be reached faster, while a global optimum can be reached by choosing larger values. Selecting larger CR values can reduce the convergence time. The search for DE control parameters that would improve the algorithm’s performance on the given optimization problem was not successful; the best result was obtained with F = 0.5, CR = 0.8 and strategy = 3. Even so, the standard DE was not able to reach the global optimum, that is, to reduce the difference between the demand and the production to zero.

The authors of [9] have shown the benefits of self-adaptive parameter control. This motivated the research direction of this paper: to reach the global optimum, the modified DE with self-adjusting F and CR has been proposed. The modified DE (Fig. 2) stopped the evolution process after 200 generations with a convergence time of 280 seconds, while the standard DE (Fig. 3) stopped after 1000 generations and 1050 seconds. The stopping criterion in this case was 500 generations without any change in the objective function value.

The final result of the optimization process with the modified DE algorithm is an optimal 24-hour production schedule for each individual HPP. This production completely satisfies the demand. The optimal 24-hour production of the individual HPPs is shown in Fig. 4, and the corresponding reservoir volumes during the 24-hour period are shown in Fig. 5.

Figure 2: Convergence of the unified and three individual objective functions values by using the modified DE [plot: Objectives versus Generation]

Figure 4: Optimal production of individual HPP proposed by the modified DE [plot: Electrical Energy Production [MWh] versus Time [h]; curves HPP1 to HPP8 and Demand]

Figure 5: Charging and discharging of reservoirs during the optimal production of individual HPP [plot: Volume [m³] between Vmin and Vmax versus Time [h]; curves HPP 1 to HPP 8]

Figure 3: Convergence of the unified and three individual objective functions values by using the standard DE [plot: Objectives versus Generation; curves f, f1, f2, f3]

V. CONCLUSION

The modified DE algorithm presented in this paper was capable of solving a complex optimization problem. As shown on the presented optimization problem, the algorithm was able to satisfy the demand with a fast convergence speed. The optimization problem was observed over a period of 24 hours and includes 8 HPPs; it therefore has 192 variables that need to be identified. Despite this, the algorithm explores the large solution space in a time span of several minutes and provides the global solution. The dispatch personnel are unable to explore such a large solution space, so the production of the individual HPPs is determined on the basis of previous experience and is therefore non-optimal. With the modified DE algorithm, the dispatch personnel obtain a guide that improves the overall production efficiency. The algorithm can also be used to indicate whether a given demand plan is feasible.

REFERENCES 

[1] S. Vazquez, S. M. Lukic, E. Galvan, L. G. Franquelo and J. M. 

Carrasco “Energy Storage System for Transport and Grid 

Applications”, IEEE Transactions on industrial electronics, Vol. 

57, pp. 3881-3895, December 2010. 

[2] I. A. Farhat, M.E. El-Hawary, “Optimization methods applied for 

solving the short-term hydrothermal coordination problem,” 

Electric Power System Research, pp. 1308-1320, 2009. 

[3] R. Storn, K. Price, “Differential Evolution – A simple and 

efficient adaptive scheme for global optimization over continuous 

spaces,” Journal of Global Optimization, pp. 341-359, 1997. 

[4] X. Yuan, Y. Zhang, L. Wang, Y. Yuan, “An enhanced differential evolution algorithm for daily optimal hydro generation scheduling,” Computers and Mathematics with Applications, pp. 2458-2468, 2008.

[5] L. Lakshminarasimman, S. Subramanian, “Short-term scheduling of hydrothermal power system cascaded reservoirs by using modified differential evolution,” IEEE Proc.-Gener. Transm. Distrib., Vol. 153, pp. 693-700, 2006.

[6] Y. Li, J. Zuo, “Optimal Scheduling of Cascade Hydropower 

System Using Grouping Differential Evolution Algorithm,” 

International Conference on Computer Science and Electronic 

Engineering, pp. 625-629, 2012. 

[7] J. Grobler, A.P. Engelbrecht, V.S.S. Yadavalli, “Multi-objective DE and PSO Strategies for Production Scheduling,” IEEE Congress on Evolutionary Computation, pp. 1154-1161, 2008.

[8] R. Gamperle, S.D. Muller, P. Koumoutsakos, “A Parameter Study for Differential Evolution,” Conf. on Advances in Intelligent System, Fuzzy Systems, pp. 293-298, 2002.

[9] J. Brest, V. Zumer, M.S. Maucec, “Self-Adaptive Differential Evolution Algorithm in Constrained Real-Parameter Optimization,” IEEE Congress on Evolutionary Computation, pp. 215-222, 2006.



Adaptive Surrogate Approach for Bayesian 

Inference in Inverse Problems 

M. Neumayer∗, H. R. B. Orlande‡, M. J. Colaço‡, D. Watzenig∗, G. Steiner∗, B. Brandstätter†, and G. S. Dulikravich§

∗Institute of Electrical Measurement and Measurement Signal Processing, Graz University of Technology, Graz, 

Austria, † Elin Motoren GmbH, Elinmotorenstrasse 1, A-8160 Preding/Weiz, Austria, ‡ Department of Mechanical 

Engineering, Federal University of Rio de Janeiro, UFRJ Rio de Janeiro, RJ, Brazil, § Department of Mechanical 

and Materials Engineering, Florida International University Miami, Florida, U.S.A. 

E-mail: neumayer@TUGraz.at 

Abstract—Bayesian inference forms a flexible and versatile solution strategy for inverse problems. Its advantages lie in the straightforward formulation of the solution process, the ability to incorporate any existing knowledge, and the output of the method itself, which provides statistical knowledge about the unknown parameters. The price of these benefits is often a greatly increased numerical effort due to the use of sampling methods. This holds especially if the underlying physical problem requires the solution of a partial differential equation. In this paper we present a simple, yet versatile and effective strategy to accelerate Bayesian inference using an adaptive surrogate approach.

Index Terms—surrogate technique, adaptive, Bayesian inference, MCMC 


I. INTRODUCTION

Inverse problems and parameter estimation problems belong to the class of indirect measurement problems, where one tries to estimate a parameter vector x ∈ R^N from observations d̃ ∈ R^M [1]. They arise in many disciplines of engineering and science. The term inverse problem is most often associated with imaging techniques like electrical capacitance/impedance tomography (ECT and EIT) or computed tomography; what these problems have in common mathematically is their inherent ill-posed nature. Parameter estimation is less strongly associated with imaging than inverse problems are, but is in many ways of even greater importance in engineering. An example is the determination of material parameters from a simple measurement setup.

Formally, the physical measurement process P can be denoted by P : x ↦ d̃. Here d̃ is the corrupted version of the otherwise noise-free measurements d. For most practical examples an additive noise model of the form d̃ = d + v is valid, where v ∈ R^M follows a certain noise distribution described by a probability density function (pdf). The modern model-based approach to estimating x from d̃ maintains a model F : x ↦ y (y ∈ R^M), which is referred to as the forward map. For most real-world problems this is a computer model that numerically solves the underlying partial differential equations (PDEs) for P. In this paper we assume that P = F holds.

Classical deterministic inversion methods then manipulate the vector x in order to minimize some useful norm of the residual vector e = y − d̃. In addition, ill-posed problems require a regularization term for the numerical stabilization of such an optimization problem. The single result of the approach is referred to as a point estimate, which we denote by x_MAP (maximum a posteriori), the value of x that provides the smallest misfit.

A contrasting approach to solving inverse problems is provided by the framework of Bayesian inference [2]. Rather than providing a single result, Bayesian inference approaches provide the posterior distribution π(x|d̃) as a summary. From this, any statistics about x, like mean, variance, correlations, MAP estimates, etc., can be computed. The price for this gain in information is an increased computational cost, as the approach requires numerous evaluations of F. This holds especially when Markov chain Monte Carlo (MCMC) methods are applied. Thus, the practicability of Bayesian methods is limited if the evaluation of F requires computationally expensive operations like the numerical solution of PDEs.

In this paper we will present a simple and versatile 

strategy to speed up Bayesian inference for inverse problems 

and parameter estimation problems. The speed up 

is provided by the use of an approximation or surrogate 

model [3]. The paper is structured as follows. In section 

II we will introduce the theory about Bayesian inversion 

and the exploration of the posterior distribution by the 

Metropolis Hastings (MH) algorithm. In section III we 

explain an acceleration approach for the MH which is 

based on the use of approximations. Finally we will 

present a numerical example where we estimate thermal 

material parameters from a heated slab. 

II. BAYESIAN INFERENCE AND MARKOV CHAIN 

MONTE CARLO 

In this section we present the framework of Bayesian inversion for the solution of inverse problems. Given measurements d̃ from a measurement process P and a model F to simulate P, the solution process is marked by the use of Bayes' law [1]

$$
\pi(x|\tilde{d}) = \frac{\pi(\tilde{d}|x)\,\pi(x)}{\pi(\tilde{d})} \propto \pi(\tilde{d}|x)\,\pi(x). \tag{1}
$$

The law connects the so-called likelihood function π(d̃|x) and the prior π(x) to formulate the posterior distribution π(x|d̃). π(d̃) is termed the evidence and has the role of a normalization constant that ensures the property ∫_{R^N} π(x|d̃) dx = 1 of a pdf. Hence, it can be skipped, leading to the right-hand expression in equation (1).

The likelihood function π(d̃|x) provides the probability measure for x causing the data d̃, given the model and statistical knowledge about the measurement noise. For an additive noise model d̃ = d + v, the likelihood is given by π(d̃|x) = π_v(y − d̃), where y is the output of the forward map. For many practical problems zero-mean white Gaussian noise, i.e. v ∼ N(0, Σ_v), where Σ_v is the covariance matrix, can be assumed. Then the likelihood function becomes

$$
\pi(\tilde{d}|x) \propto \exp\left( -\frac{1}{2} \left( y - \tilde{d} \right)^{T} \Sigma^{-1} \left( y - \tilde{d} \right) \right), \tag{2}
$$

where Σ is set to Σ_v.
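For illustration, the logarithm of the Gaussian likelihood (2) can be evaluated as in the following sketch. It assumes white noise with a single standard deviation `sigma_v`, so that the covariance is sigma_v² times the identity matrix; the function name and the example numbers are hypothetical.

```python
def gaussian_log_likelihood(y, d_meas, sigma_v):
    """Log of (2) up to an additive constant, assuming Sigma = sigma_v**2 * I.

    y:       model output F(x) for the candidate parameters
    d_meas:  noisy measurements
    sigma_v: standard deviation of the measurement noise (assumed scalar)
    """
    residual_sq = sum((y_k - d_k) ** 2 for y_k, d_k in zip(y, d_meas))
    return -0.5 * residual_sq / sigma_v ** 2

# A perfect fit gives the maximum value 0; misfit lowers the log-likelihood.
perfect = gaussian_log_likelihood([1.0, 2.0], [1.0, 2.0], sigma_v=0.1)
misfit = gaussian_log_likelihood([1.0, 2.0], [1.0, 4.0], sigma_v=2.0)
```

Working with the logarithm avoids numerical underflow when many measurements are combined.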

The prior π(x) provides a probability measure about x being the solution. While the design of the likelihood has to follow strict mathematical rules due to its definition, the prior provides a very flexible way to incorporate expert knowledge about x. For example, if we know that the i-th component of x has a lower and an upper bound, the corresponding prior is given by the uniform distribution x_i ∼ U(x_{i,min}, x_{i,max}). If a mean value of the j-th component is known, a Gaussian distribution x_j ∼ N(μ_{x_j}, σ_{x_j}) can be used to express the prior, where σ_{x_j} controls the deviation.
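The two prior types can be combined into a single log-prior, as sketched below. The numerical bounds and the Gaussian mean and deviation are the ones quoted for λ, α and k1 in the numerical example of this paper; the function names are illustrative, and normalization constants are dropped since they cancel in acceptance ratios.

```python
import math

def log_uniform_prior(value, lower, upper):
    """Unnormalized log-density of U(lower, upper): 0 inside, -inf outside."""
    return 0.0 if lower <= value <= upper else -math.inf

def log_gaussian_prior(value, mean, std):
    """Unnormalized log-density of N(mean, std)."""
    return -0.5 * ((value - mean) / std) ** 2

def log_prior(lam, alpha, k1):
    """Joint log-prior for independent components (linear case)."""
    return (log_gaussian_prior(lam, 0.13, 0.01)        # lambda in W/(m K)
            + log_uniform_prior(alpha, 1.0, 15.0)      # alpha in W/(m^2 K)
            + log_uniform_prior(k1, 0.001, 0.01))      # k1 bounds from the text

inside = log_prior(0.13, 11.0, 0.0045)
outside = log_prior(0.13, 20.0, 0.0045)   # alpha violates its upper bound
```

A value of minus infinity means that a proposal outside the bounds is always rejected by the sampler.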

The posterior π(x|d̃) expresses the probability for x being the solution given the data d̃, the model and the prior. Rather than a single result, the posterior covers all possible solutions. For a post analysis of π(x|d̃) one could look at a specific realization of x and evaluate its probability. However, as this procedure is of little use, some meaningful point measures have become popular. One of them is the maximum a posteriori (MAP) estimate x_MAP = argmax_x π(x|d̃), which is the mode of the posterior. The other one is the conditional mean (CM) estimate

$$
x_{CM} = \int_{\mathbb{R}^{N}} x \, \pi(x|\tilde{d}) \, dx, \tag{3}
$$

which summarizes the complete distribution. It can easily be seen that the MAP estimate can be found by solving an optimization problem, either maximizing the posterior or minimizing its negative logarithm. This corresponds to classical regularized approaches, except that the likelihood introduces statistical knowledge about the noise. This fact results in generally higher modeling efforts when using Bayesian methods. The CM estimate requires the evaluation of a high-dimensional integral.

An analytic solution of the integral is often not possible, as the integral is of high dimension and also because of the complicated interaction of the forward map. Standard numerical schemes like the well-known Gauss quadrature also cannot be applied to such integrals, due to the lack of knowledge about the support. The numerical tool to solve such integrals is known as Monte Carlo integration. Here a set x^(N) of N samples from the posterior is generated, where the frequency of the samples follows the target distribution. Then the CM integral can be approximated by

$$
x_{CM} = \int_{\mathbb{R}^{N}} x \, \pi(x|\tilde{d}) \, dx \approx \frac{1}{N} \sum_{i=1}^{N} x^{(N)}_{i}. \tag{4}
$$

In the same way any other integral (also about functions 

of x) can be solved. The generation of samples from 

a distribution belongs to the discipline of computational 

Bayesian inference and will be discussed in the following 

subsection. 
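The Monte Carlo approximation of the CM estimate is simply the component-wise average of the samples. The sketch below demonstrates this on samples drawn directly from a known two-dimensional Gaussian, which stands in for the output of a posterior sampler; the distribution and the function name are assumptions made for the example.

```python
import random

def conditional_mean(samples):
    """Monte Carlo estimate of the CM: component-wise average of the samples."""
    n = len(samples)
    dim = len(samples[0])
    return [sum(s[k] for s in samples) / n for k in range(dim)]

# Stand-in for posterior samples: independent Gaussians with known means.
rng = random.Random(1)
samples = [[rng.gauss(2.0, 0.5), rng.gauss(-1.0, 0.1)] for _ in range(20000)]
x_cm = conditional_mean(samples)
```

Other expectations, such as variances, follow the same pattern by averaging the corresponding function of each sample.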

One important aspect of the Bayesian framework that has not been stated so far is the possibility to treat nuisance parameters ν in the same way as the state vector x. That is, if the forward map F is in fact a function F_ν(x), it is possible to do inference about both x and ν in the same natural way. This can be used if parameters of a measurement system are unknown or uncertain.

A. The Metropolis Hastings (MH) Algorithm 

Algorithms for practical computational Bayesian inference are typically sampling algorithms [4]. They can be seen as random number generators which compute independent samples from an arbitrary target distribution. For inverse problems the target distribution is the posterior. On a discrete state space the frequency of certain samples corresponds to the probability of the sample, enabling the powerful tool of Monte Carlo integration. As can be seen from equation (2), the evaluation of π(d̃|x) requires one evaluation of the forward map F. This already indicates that sampling methods result in generally higher computational cost. For computational inference a class of algorithms termed MCMC methods was developed, named so because they rely on an underlying Markov chain X. The MH algorithm [5] is one prominent example of this class of methods. The algorithm works as follows:

1) Pick the current state x = X_n from the Markov chain.

2) With proposal density q(x, x′) generate a new state x′.

3) Compute α = min{1, [π(x′|d̃) q(x′, x)] / [π(x|d̃) q(x, x′)]}.

4) With probability α accept x′ and set X_{n+1} = x′, otherwise reject x′ and set X_{n+1} = x.

Starting from the current state x of the Markov chain X (line 1), the MH algorithm generates a proposal candidate x′ (line 2) using the proposal kernel q(x, x′). Then the acceptance ratio α is evaluated in line 3 for the proposal x′, which requires one evaluation of the forward map. If the proposal is accepted it becomes the new state of the Markov chain, otherwise it is rejected. The rejection of proposal candidates is critical with respect to the computational efficiency of the MH algorithm, as a high rejection rate leads to a large number of forward map evaluations without generating a new state. This is strongly affected by the proposal kernel q(x, x′), which drives the exploration of the posterior distribution.
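The four steps can be sketched for a scalar state as follows. This is a minimal random-walk variant: it assumes a symmetric Gaussian proposal, so the q terms cancel in α, and it works with the logarithm of the unnormalized posterior. The standard normal target used in the demonstration is a stand-in for a real posterior, and all names are illustrative.

```python
import math
import random

def metropolis_hastings(log_post, x0, n_steps, step, rng):
    """Random-walk MH for a scalar state with a symmetric Gaussian proposal."""
    x, lp = x0, log_post(x0)
    chain, accepted = [x], 0
    for _ in range(n_steps):
        x_new = x + rng.gauss(0.0, step)   # step 2: symmetric proposal
        lp_new = log_post(x_new)           # costs one forward-map evaluation
        alpha = math.exp(min(0.0, lp_new - lp))   # step 3 (q cancels)
        if rng.random() < alpha:           # step 4: accept or reject
            x, lp = x_new, lp_new
            accepted += 1
        chain.append(x)                    # a rejected step repeats the state
    return chain, accepted / n_steps

rng = random.Random(0)
chain, acceptance = metropolis_hastings(lambda x: -0.5 * x * x, 0.0,
                                        20000, 2.0, rng)
mean = sum(chain) / len(chain)
var = sum((c - mean) ** 2 for c in chain) / len(chain)
```

For this target the sample mean and variance of the chain should approach 0 and 1. The acceptance rate depends directly on the proposal width `step`, which illustrates the role of the proposal kernel discussed above.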

III. ACCELERATION OF THE MH USING SURROGATES 

The strategy we use to speed up the classical MH algorithm is based on the use of an approximation or surrogate F* [6]. An approximation F* has a considerably lower runtime than F, but at the cost of an approximation error e = y − y*. Subsequently we introduce the likelihood function π*(d̃|x) to indicate the use of F*. Then the delayed acceptance Metropolis Hastings (DAMH) algorithm [7] is given by

1) Pick the current state x = X_n from the Markov chain.

2) With proposal density q(x, x′) generate a new state x′.

3) Compute α = min{1, [π*(x′|d̃) q(x′, x)] / [π*(x|d̃) q(x, x′)]}.

4) With probability α accept x′ to be a proposal for the standard MH algorithm. Otherwise set x′ = x and return to 2.

5) Compute β = min{1, [π(x′|d̃) q(x′, x)] / [π(x|d̃) q(x, x′)]}.

6) With probability β accept x′ and set X_{n+1} = x′, otherwise reject x′ and set X_{n+1} = x.

As can be seen, the DAMH algorithm consists of two nested MH algorithms (in the original MH algorithm steps 3 and 4 do not exist). The DAMH tries to gain its advantage from a pre-evaluation of the proposal candidates x′ on the distribution π*(x′|d̃). An evaluation of π(x′|d̃) in the inner MH is only performed if the proposal is accepted in the outer MH. In this sense the outer MH algorithm of the DAMH can be seen as a filter for bad proposals or as an improved proposal generator. The gain in performance evidently depends strongly on the difference between π*(x′|d̃) and π(x′|d̃).
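A two-stage sampler of this kind might be sketched as below for a scalar state. Two points are assumptions of the sketch rather than statements about the authors' code: the proposal is a symmetric random walk, and the second-stage ratio divides out the surrogate ratio, which is the usual delayed acceptance correction that keeps the exact posterior as the stationary distribution and differs from the simplified expression in step 5. A stage-one rejection here repeats the current state in the chain.

```python
import math
import random

def damh(log_post, log_post_surr, x0, n_steps, step, rng):
    """Delayed acceptance MH for a scalar state (symmetric proposal assumed).

    log_post:      expensive log-posterior (uses the full model F)
    log_post_surr: cheap log-posterior (uses the surrogate F*)
    Returns the chain and the number of expensive evaluations.
    """
    x = x0
    lp, lps = log_post(x), log_post_surr(x)
    chain, n_full = [x], 1
    for _ in range(n_steps):
        x_new = x + rng.gauss(0.0, step)
        lps_new = log_post_surr(x_new)
        # Stage 1: screen the proposal on the surrogate posterior.
        if rng.random() < math.exp(min(0.0, lps_new - lps)):
            lp_new = log_post(x_new)       # expensive model only evaluated here
            n_full += 1
            # Stage 2: exact ratio corrected by the surrogate ratio.
            log_beta = (lp_new - lp) - (lps_new - lps)
            if rng.random() < math.exp(min(0.0, log_beta)):
                x, lp, lps = x_new, lp_new, lps_new
        chain.append(x)
    return chain, n_full

rng = random.Random(0)
exact = lambda x: -0.5 * x * x                       # stand-in posterior
approx = lambda x: -0.5 * ((x - 0.3) / 1.2) ** 2     # deliberately biased surrogate
chain, n_full = damh(exact, approx, 0.0, 20000, 2.0, rng)
mean = sum(chain) / len(chain)
var = sum((c - mean) ** 2 for c in chain) / len(chain)
```

Even with the biased surrogate, the chain statistics should match the exact target, while `n_full` stays below the number of proposals because poor candidates are filtered out cheaply.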

An interesting point about the DAMH is the availability of the deterministic approximation error e = y − y* in line 5. This knowledge can be used to improve the algorithm in two ways:

• Learn about the approximation error to adapt the likelihood π*(x′|d̃).

• Adapt the approximation F* to improve the quality.

The approach to incorporate knowledge about the approximation error is referred to as the enhanced error model (EEM) [1]. Here the deterministic approximation error e is treated as a random variable. Mostly a Gaussian distribution of e is assumed, describing the error as e ∼ N(μ_e, Σ_e). Then the likelihood function π*(d̃|x) becomes

$$
\pi^{*}(\tilde{d}|x) \propto \exp\left( -\frac{1}{2} \left( y^{*} + \mu_e - \tilde{d} \right)^{T} \Sigma^{-1} \left( y^{*} + \mu_e - \tilde{d} \right) \right), \tag{5}
$$

where Σ is the sum of Σ_v and Σ_e. In its original form, the distribution N(μ_e, Σ_e) is computed using samples over the prior π(x). However, with the availability of the current value e_n in the DAMH, an adaptive approximation error model can be built by [8]

$$
\mu_{e,n} = \frac{1}{n} \left( (n-1)\,\mu_{e,n-1} + e_n \right), \tag{6}
$$

$$
C_{e,n} = C_{e,n-1} + e_n e_n^{T}, \tag{7}
$$

$$
\Sigma_{e,n} = \frac{1}{n-1} \left( C_{e,n} - n\,\mu_{e,n}\,\mu_{e,n}^{T} \right). \tag{8}
$$

Due to this the likelihood π*(d̃|x) adapts to the posterior during runtime, which means that no sampling of π(x) is necessary to build N(μ_e, Σ_e) before the solution process for the data d̃ starts.
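A running estimate of the error statistics can be kept with a few lines of code. The sketch below assumes that C accumulates the raw outer products of the errors and that the covariance is the unbiased sample covariance formed from C and the running mean; the class and method names are illustrative.

```python
class AdaptiveErrorModel:
    """Running mean and covariance of the approximation error e = y - y*."""

    def __init__(self, dim):
        self.n = 0
        self.dim = dim
        self.mu = [0.0] * dim
        self.C = [[0.0] * dim for _ in range(dim)]

    def update(self, e):
        """Add one error vector: running mean and accumulated outer products."""
        self.n += 1
        n = self.n
        self.mu = [((n - 1) * m + e_k) / n for m, e_k in zip(self.mu, e)]
        for r in range(self.dim):
            for c in range(self.dim):
                self.C[r][c] += e[r] * e[c]

    def covariance(self):
        """Sample covariance (C - n mu mu^T) / (n - 1); needs n >= 2."""
        n = self.n
        if n < 2:
            raise ValueError("at least two error samples are required")
        return [[(self.C[r][c] - n * self.mu[r] * self.mu[c]) / (n - 1)
                 for c in range(self.dim)] for r in range(self.dim)]

model = AdaptiveErrorModel(dim=2)
model.update([1.0, 0.0])
model.update([3.0, 2.0])
cov = model.covariance()
```

Each update costs only a rank-one accumulation, so it adds little to the cost of an expensive forward-map evaluation.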

The second point addresses the possibility of using the knowledge about e to improve the quality of the approximation F* during runtime. A considerably simple update is possible if F* is of the form y* = P x_a. Here x_a denotes the augmented state vector, which holds x in an adequate form, i.e. arbitrary functions of the components of x or additional variables like the simulation time for transient problems.

Thus, the approximation can be nonlinear with respect to x while remaining linear with respect to the elements of P. This is important, as a number of update algorithms exist for this class of approximations. In this work we use the least mean squares (LMS) algorithm given by [9]

$$
P_{n+1} = P_{n} + \gamma \, e_n \, x_{a,n}^{T}, \tag{9}
$$

where γ is a step width parameter known as the adaptation coefficient. The simplicity of the LMS update means that it adds almost no computational cost, and it helps to improve the quality of the approximation F* for the posterior distribution. Again the initial matrix P can be determined from samples over the prior distribution π(x). For this the overdetermined equation system X_a P^T = Y has to be assembled and solved, where the matrix X_a holds the augmented state vectors from the samples and Y contains the exact solutions evaluated by F. There is also the possibility to run the standard MH for some time to learn about P and then switch to the DAMH. The choice of the adaptation parameter γ affects the learning speed of the LMS algorithm. For stability reasons γ has an upper limit, which depends on the problem and can only be derived under restrictive conditions. However, as an MCMC algorithm provides an enormous number of evaluations, it is not critical to set γ to a small value: even this provides an improvement (although slower) to the approximation P, and the LMS algorithm operates in a stable state.
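The LMS update (9) for a surrogate of the form y* = P x_a can be sketched as follows. The matrix is stored as a list of rows, and the function names and the numbers in the demonstration are assumptions made for the example.

```python
def surrogate(P, x_a):
    """Evaluate y* = P x_a for a matrix stored as a list of rows."""
    return [sum(p * x for p, x in zip(row, x_a)) for row in P]

def lms_update(P, x_a, y_exact, gamma):
    """One LMS step following (9): P <- P + gamma * e * x_a^T, e = y - y*."""
    y_star = surrogate(P, x_a)
    e = [y - ys for y, ys in zip(y_exact, y_star)]
    return [[p + gamma * e_r * x for p, x in zip(row, x_a)]
            for row, e_r in zip(P, e)]

# One output, two augmented state components (hypothetical numbers).
P = [[0.0, 0.0]]
x_a = [1.0, 2.0]
y_exact = [5.0]
P_new = lms_update(P, x_a, y_exact, gamma=0.1)
error_before = abs(y_exact[0] - surrogate(P, x_a)[0])
error_after = abs(y_exact[0] - surrogate(P_new, x_a)[0])
```

In this example a single step halves the error at the visited state. A larger γ would learn faster but can become unstable, which is the trade-off described above.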

To use both the update of the approximation y* = P x_a and the adaptive error model, the approximation should be reevaluated for the current state vector x. This requires a second evaluation of F*, but this is computationally cheap due to the design of F*.

IV. A NUMERICAL EXAMPLE 

To demonstrate our approach on a numerical example, we consider an indirect measurement problem where we want to estimate thermophysical properties of a slab from a transient heat transfer experiment. We consider a slab of length L, which we model by means of a 1D simulation in the domain Ω : 0 ≤ x ≤ L. The slab is initially at the uniform temperature ϑ0. On the left side (x = 0) a uniform heat flux J is applied by an electric heater. On the right side, at x = L, the temperature ϑ(L, t) is measured over time. The heat on this side is exchanged by convection with the surrounding medium at the temperature ϑ0. This exchange depends on a heat transfer coefficient α in W m⁻² K⁻¹. There are no heat sources within the medium, and the thermophysical properties are assumed constant in a first approximation. The mathematical formulation of this heat conduction problem is given by:

1 dϑ 

k dt = ∂2ϑ ∂x2 −λ 

in 0 0 (11) 

$$
\frac{\partial \vartheta}{\partial x} + \alpha \vartheta = \alpha \vartheta_0 \quad \text{at } x = L, \ \text{for } t > 0 \tag{12}
$$

ϑ = ϑ0 for t =0,in0



TABLE I
SUMMARY OF THE RESULTS FOR THE LINEAR CASE.

Nr.   Experiment  σ_v   μ_λ     σ_λ         μ_α      σ_α      μ_k1        σ_k1        T_sim,rel
                  (K)   (W/mK)  (W/mK)      (W/m²K)  (W/m²K)  (Ω/m²)      (Ω/m²)      (%)
true                    0.12                11                4.5 × 10⁻³
1     F           0.1   0.12    6.6 × 10⁻³  11.4     0.68     4.6 × 10⁻³  2.3 × 10⁻⁴  100
2     F*1         0.1   0.13    6.3 × 10⁻³  10.7     0.59     4.3 × 10⁻³  1.7 × 10⁻⁴  75
3     F*2         0.1   0.12    8.4 × 10⁻³  10.2     0.23     4.3 × 10⁻³  6.0 × 10⁻⁴  12
4     F*3         0.1   0.12    4.3 × 10⁻³  12.1     0.29     4.6 × 10⁻³  2.3 × 10⁻⁵  12
5     F           0.5   0.13    7.0 × 10⁻³  12.7     1.72     4.7 × 10⁻³  4.2 × 10⁻⁴  100
6     F*1         0.5   0.13    5.9 × 10⁻³  9.8      1.42     4.2 × 10⁻³  3.5 × 10⁻⁴  110
7     F*2         0.5   0.13    3.1 × 10⁻³  10.5     1.10     4.4 × 10⁻³  2.9 × 10⁻⁴  44
8     F*3         0.5   0.13    3.3 × 10⁻³  10.5     1.05     4.4 × 10⁻³  2.6 × 10⁻⁴  44

TABLE II
SUMMARY OF THE RESULTS FOR THE NONLINEAR CASE.

Nr.   Experiment  σ_v   μ_λ     σ_λ         μ_α      σ_α      μ_k1        σ_k1        μ_k2     σ_k2        T_sim,rel
                  (K)   (W/mK)  (W/mK)      (W/m²K)  (W/m²K)  (Ω/m²)      (Ω/m²)      (W/mK²)  (W/mK²)     (%)
true                    0.12                11                4.5 × 10⁻³              0.02
1     F           0.1   0.123   5.5 × 10⁻³  12.0     0.31     4.8 × 10⁻³  1.1 × 10⁻⁴  0.013    4.6 × 10⁻³  100
2     F           0.05  0.127   3.9 × 10⁻³  11.3     0.19     4.6 × 10⁻³  7.1 × 10⁻⁵  0.015    3 × 10⁻³    100
3     F           0.5   0.130   9.8 × 10⁻³  13.4     1.42     5.2 × 10⁻³  4.3 × 10⁻⁴  0.018    11 × 10⁻³   100
4     F*1         0.1   0.129   4.4 × 10⁻³  10.7     0.27     4.4 × 10⁻³  1.1 × 10⁻⁴  0.016    5.6 × 10⁻³  97.6
5     F*1         0.5   0.128   5.8 × 10⁻³  11.9     1.24     4.8 × 10⁻³  3.7 × 10⁻⁴  0.012    9.4 × 10⁻³  140
6     F*3         0.1   0.122   3.1 × 10⁻³  11.8     0.22     4.7 × 10⁻³  8.1 × 10⁻⁵  0.019    4.6 × 10⁻³  20.9
7     F*3         0.5   0.128   5 × 10⁻³    10.8     1.29     4.6 × 10⁻³  4.2 × 10⁻⁴  0.017    9.4 × 10⁻³  72.5

Fig. 1. Approximation error e for F*1 and F*2 over the prior. (a) Standard deviation of e over the discretization for F*1 [plot: σ_e versus dt (s) and # FE]. (b) Distribution (pdf) of e for F*2.

how to analyze the results we refer to [4]. For the simulation we used the following parameters: L = 0.1 m, ρ = 1040 kg m⁻³, c = 1350 J kg⁻¹ K⁻¹. The ambient temperature was set to ϑ0 = 20 °C and the electrical current to I = 100 A. We assumed that ϑ(L, t) is measured every 20 seconds for 3000 s. The priors for the state vector are given by a Gaussian distribution with μ_λ = 0.13 W m⁻¹ K⁻¹ and σ_λ = 0.01 W m⁻¹ K⁻¹ for λ, and by uniform distributions with the boundaries 1 W m⁻² K⁻¹ ≤ α ≤ 15 W m⁻² K⁻¹, 0.001 Ω m⁻² ≤ k1 ≤ 0.01 Ω m⁻², and 0 W m⁻¹ K⁻² ≤ k2 ≤ 0.04 W m⁻¹ K⁻² (nonlinear case) for the remaining variables. The proposal generation is done by randomly selecting a component of the state vector x. For λ the proposal is generated from the prior about λ. For α and k1, a Gaussian distributed random variable with a standard deviation of 4% of the range given by the prior is added to the current state. In the nonlinear case we only use the approximations F*1 and F*3.
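The component-wise proposal described above could look like the following sketch for the linear case. The dictionary layout and the function name are illustrative; the prior mean, deviation, bounds and the 4% step width are the values quoted in the text. Since the proposal for λ is drawn from its prior and is therefore not symmetric, a complete sampler would have to include the proposal densities in the acceptance ratio.

```python
import random

PRIOR_LAMBDA = (0.13, 0.01)                               # mean, std in W/(m K)
PRIOR_RANGES = {"alpha": (1.0, 15.0), "k1": (0.001, 0.01)}

def propose(state, rng):
    """Change one randomly selected component of the state (linear case)."""
    new_state = dict(state)
    name = rng.choice(["lambda", "alpha", "k1"])
    if name == "lambda":
        # Independence proposal: draw lambda directly from its Gaussian prior.
        new_state["lambda"] = rng.gauss(*PRIOR_LAMBDA)
    else:
        # Random walk with a standard deviation of 4% of the prior range.
        lower, upper = PRIOR_RANGES[name]
        new_state[name] = state[name] + rng.gauss(0.0, 0.04 * (upper - lower))
    return new_state, name

rng = random.Random(3)
state = {"lambda": 0.13, "alpha": 11.0, "k1": 0.0045}
new_state, changed = propose(state, rng)
```

Updating one component at a time keeps each move small, at the price of needing more steps to explore strongly correlated parameters.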

TABLE III
BEHAVIOR OF THE CHAINS FOR THE LINEAR CASE.

Nr.  Experiment  σv (K)  Acα (%)  Acβ|α (%)  Acβ (%)  τIACT
1    F           0.1     16.4     X          X        392
2    F∗_1        0.1     18.0     75.8       13.7     1620
3    F∗_2        0.1     10.3     65.8       6.8      420
4    F∗_3        0.1     10.2     68.0       6.9      228
5    F           0.5     56.3     X          X        263
6    F∗_1        0.5     54.4     82.9       45.1     110
7    F∗_2        0.5     38.8     78.6       30.5     44
8    F∗_3        0.5     38.2     78.6       29.8     37

TABLE IV
BEHAVIOR OF THE CHAINS FOR THE NONLINEAR CASE.

Nr.  Experiment  σv (K)  Acα (%)  Acβ|α (%)  Acβ (%)  τIACT
1    F           0.1     30.5     X          X        57
2    F           0.05    17.2     X          X        63
3    F           0.5     67.9     X          X        132
4    F∗_1        0.1     33.4     79.8       26.6     38
5    F∗_1        0.5     64.6     87.1       56.3     72
6    F∗_3        0.1     18.8     70.9       13.4     509
7    F∗_3        0.5     56.0     86.1       48.3     121

Fig. 2. MCMC output and analysis for λ. (a) MCMC output for λ (Wm−1K−1) over the MCMC sample index. (b) Histogram plot for λ (Wm−1K−1).

Figure 2 depicts the output of the Markov chain for
λ. From the histogram in figure 2(b) we can see the
distribution. Tables I and II summarize the results for the
linear and the nonlinear case, including the true values of
the state vector x, for different standard deviations σv of
the additive measurement noise. As can be observed, all
estimates meet the true values with reasonable accuracy.
This especially holds for the linear case. For the nonlinear
case some deviations occur, which can be linked to
the complexity of the problem. An interesting effect can
be seen in the standard deviations of table I: increased
noise levels have a stronger effect on σk1 than on
the other standard deviations. A speed
improvement of up to a factor of 10 for the linear case and 5
for the nonlinear case could be achieved. The speed-up
also depends on the noise level, as the noise has a direct
influence on π(x | d̃), so the choice of the proposal kernel
is important. Tables III and IV provide statistics about the

behavior of the algorithms. For the MH, the ratio Acα
states the percentage of accepted proposal candidates. For
the DAMH, it states the ratio of accepted proposals in
the first step, and Acβ states the overall acceptance in
the second step. The value Acβ|α states the acceptance
of a proposal in the second step, given an acceptance
in the first step. Hence, this number provides a quality
measure for the approximation F∗. The almost identical
level of Acβ|α in lines 3 and 4, and in lines 7 and 8, of
table III indicates that the approximation already has
a high quality and that no further improvement could
be achieved by the adaptation. This corresponds to the
observations of figure 1. Tables III and IV also
explain the larger computation times of some DAMH
variants with respect to the MH. In these cases a high value
of Acα results in a large number of evaluations of F.
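How these acceptance statistics arise can be illustrated with a minimal sketch of a two-step sampler. It is not the implementation used for the experiments: it assumes a symmetric random-walk proposal, a state-independent approximation, and a one-dimensional toy posterior, and all function and variable names (`damh`, `exact`, `approx`, `stats`) are hypothetical.

```python
import math
import random


def damh(log_post, log_post_approx, x0, n_iter, step, seed=0):
    """Two-step (delayed acceptance) random-walk Metropolis-Hastings.

    log_post        -- expensive log posterior (plays the role of F)
    log_post_approx -- cheap log posterior (plays the role of F*)
    Returns the chain and the acceptance statistics in percent.
    """
    rng = random.Random(seed)
    x, lp, lpa = x0, log_post(x0), log_post_approx(x0)
    chain, n_first, n_second = [], 0, 0
    for _ in range(n_iter):
        y = x + rng.gauss(0.0, step)      # symmetric proposal
        lpa_y = log_post_approx(y)        # cheap evaluation for every candidate
        # Step 1: accept or reject using the approximation only.
        if math.log(1.0 - rng.random()) < lpa_y - lpa:
            n_first += 1
            lp_y = log_post(y)            # expensive evaluation, only after step 1
            # Step 2: correct the step-1 decision with the full model.
            if math.log(1.0 - rng.random()) < (lp_y - lp) - (lpa_y - lpa):
                n_second += 1
                x, lp, lpa = y, lp_y, lpa_y
        chain.append(x)
    stats = {
        "Ac_alpha": 100.0 * n_first / n_iter,
        "Ac_beta_given_alpha": 100.0 * n_second / max(n_first, 1),
        "Ac_beta": 100.0 * n_second / n_iter,
        "full_evaluations": n_first + 1,  # +1 for the initial state
    }
    return chain, stats


# Toy densities (assumed): exact N(0, 1), approximation N(0, 1.2^2).
def exact(x):
    return -0.5 * x * x


def approx(x):
    return -0.5 * (x / 1.2) ** 2


chain, stats = damh(exact, approx, x0=0.0, n_iter=20000, step=2.0)
print(stats)
```

By construction, Acβ equals the product of Acα and Acβ|α, and F is evaluated once per first-step acceptance. Row 2 of table III is consistent with this relation: 18.0 % · 75.8 % ≈ 13.6 %, against a listed Acβ of 13.7 %.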

Thus, the total cost of all evaluations of F and F∗ exceeds
the cost of evaluating F only. This also indicates that
the proposal kernel is not yet optimal for sampling from
the posterior. The value τIACT is referred to as the integrated
autocorrelation time (IACT). It was computed with the
methods explained in [10] and provides a measure of
the statistical efficiency of an MCMC algorithm by the
distance between independent samples in the Markov
chain. As can be seen, most DAMH variants have a lower
τIACT than the MH and are thus statistically more efficient.
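To make the meaning of τIACT concrete, the following sketch estimates an integrated autocorrelation time from a scalar chain. It is a simplified estimator and not the procedure of [10]: the convention τ = 1 + 2 Σ ρ(t), the self-consistent truncation window with factor `c`, and the synthetic AR(1) test chain are all assumptions of this example.

```python
import random


def iact(chain, c=5.0):
    """Integrated autocorrelation time, tau = 1 + 2 * sum_t rho(t).

    The sum is truncated at the first lag W with W >= c * tau
    (a simple self-consistent window; an assumption of this sketch).
    """
    n = len(chain)
    mean = sum(chain) / n
    centered = [x - mean for x in chain]
    var = sum(d * d for d in centered) / n
    if var == 0.0:
        raise ValueError("chain is constant")
    tau = 1.0
    for lag in range(1, n // 2):
        cov = sum(centered[i] * centered[i + lag] for i in range(n - lag)) / n
        tau += 2.0 * cov / var
        if lag >= c * tau:
            break
    return tau


def ar1_chain(phi, n, seed=0):
    """Synthetic chain x_k = phi * x_{k-1} + noise with exact tau = (1 + phi) / (1 - phi)."""
    rng = random.Random(seed)
    x, out = 0.0, []
    for _ in range(n):
        x = phi * x + rng.gauss(0.0, 1.0)
        out.append(x)
    return out


chain = ar1_chain(phi=0.5, n=50000)
tau = iact(chain)
print("tau:", tau, "effective samples:", len(chain) / tau)
```

Under this convention a chain of N samples carries roughly N / τ independent samples, which is why a smaller τIACT indicates a statistically more efficient sampler.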

In section IV we stated the linear dependence of the
parameters due to equations (11) and (12), and that
this circumstance can be observed in the results. Figure
3 depicts a scatter plot for α and k1. The correlation
coefficient is computed as 0.98. From this we can
conclude that the measurement system is inappropriate
and that a second, spatially distributed measurement would be
required. This is a direct conclusion from the Bayesian
analysis; an optimization-based solution is not able to
provide such insight.
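A correlation coefficient of this kind can be obtained directly from the samples of the Markov chain. The sketch below computes the Pearson coefficient of two sample sequences; the function name and the synthetic, linearly coupled samples are hypothetical and only stand in for the chains of α and k1.

```python
import math
import random


def pearson(u, v):
    """Pearson correlation coefficient of two equally long sample sequences."""
    n = len(u)
    if n != len(v) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mu, mv = sum(u) / n, sum(v) / n
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / math.sqrt(suu * svv)


# Synthetic stand-in for two strongly coupled parameters (assumed data).
rng = random.Random(0)
first = [rng.gauss(0.0, 1.0) for _ in range(5000)]
second = [2.0 * a + rng.gauss(0.0, 0.3) for a in first]
print("correlation:", pearson(first, second))
```

A value close to ±1 means that the data constrain only a combination of the two parameters, not each parameter individually.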


Fig. 3. The scatter plot of α and k1 indicates a strong
correlation between the variables. This indicates that more spatially
distributed measurements are required. (Axes: k1 in Ωm−2 over α in Wm−2K−1.)

In this work a general approach for accelerating
Bayesian inference for indirect measurement problems
is presented. The approach features the use of simple
approximations by incorporating error knowledge and
can even be used to update approximation models during
runtime. The presented framework can easily be
applied to different problem types, e.g. electrical capacitance
tomography, to perform Bayesian inference for the
solution of indirect measurement problems.

REFERENCES 

[1] J. Kaipio and E. Somersalo, Statistical and Computational Inverse Problems, ser. Applied Mathematical Sciences. Springer, 2005, vol. 160.

[2] J. M. Bernardo and A. F. M. Smith, Bayesian Theory. New York: John Wiley & Sons, 1994 (ISBN: 0-471-92416-4).

[3] A. Forrester, A. Sobester, and A. Keane, Engineering Design Via Surrogate Modelling: A Practical Guide. Wiley, 2008.

[4] L. Tierney, “Markov chains for exploring posterior distributions,” Annals of Statistics, vol. 22, pp. 1701–1762, 1994.

[5] W. Hastings, “Monte Carlo sampling methods using Markov chains and their applications,” Biometrika, vol. 57, no. 1, pp. 97–109, 1970.

[6] H. R. B. Orlande, M. J. Colaço, and G. S. Dulikravich, “Approximation of the likelihood function in the Bayesian technique for the solution of inverse problems,” Inverse Problems in Science & Engineering, vol. 16, pp. 677–692, 2008.

[7] J. A. Christen and C. Fox, “Markov chain Monte Carlo Using an Approximation,” Journal of Computational and Graphical Statistics, vol. 14, no. 4, pp. 795–810, 2005. [Online]. Available: http://pubs.amstat.org/doi/abs/10.1198/106186005X76983

[8] T. Cui, “Bayesian Calibration of Geothermal Reservoir Models via Markov Chain Monte Carlo,” Ph.D. dissertation, University of Auckland, 2010.

[9] S. Haykin, Adaptive Filter Theory (4th Edition). Prentice Hall, Sep.

[10] U. Wolff, “Monte Carlo errors with less errors,” Computer Physics Communications, vol. 156, no. 2, pp. 143–153, 2004. [Online]. Available: http://www.sciencedirect.com/science/article/B6TJ5-4B3NPMC-3/2/94bd1b60aba9b7a9ea69ac39d7372fc5

Author Index

A

Aleksić, Slavoljub, 73, 300

Alotto, Piergiorgio, 267, 374

Anastasiadis, Ioannis, 271

Andjelic, Zoran, 167

Arkkio, Antero, 214

B

Balabozov, Iosko, 59

Bardi, Istvan, 1

Bauernfeind, Thomas, 327, 337

Bavastro, Davide, 101

Belahcen, Anouar, 214

Bellwald, Lukas, 271

Benabou, Abdelkader, 95

Besser, Bruno, 7

Bielby, Steven, 248

Bilicz, Sandor, 346

Bíró, Oszkár, 31, 41, 144, 190, 232, 327, 337

Brandstätter, Bernhard, 67, 403

Brochet, Pascal, 78

Buchau, André, 89, 386

Buchinger, Andreas, 271

Burgard, Stefan, 13

C

Calvano, Flavio, 208

Campana, Luca Giovanni, 171

Canova, Aldo, 101

Cardoso Bora, Teodoro, 267

Chiariello, Andrea Gaetano, 357

Ciric, Ioan R., 352

Clénet, Stéphane, 95

Coenen, Isabel, 198, 305

Colaco, Marcello J., 403

Cvetkovic, Nenad, 294

D

Dal Mut, Giorgio, 208

Dessoude, Maxime, 78

Di Barba, Paolo, 171

Diwoky, Franz, 232

dos Santos Coelho, Leandro, 267

Duca, Anton, 262

Düzgün, Bilal, 154

Dughiero, Fabrizio, 171

Dulikravich, George S., 403

Dyczij-Edlinger, Romanus, 13, 19

E

Ebrahimi, Bashir Mahdi, 125, 131, 315

Eidenberger, Norbert, 186

Elistratova, Vera, 78

Ellermann, Katrin, 144

Ertl, Michael, 181, 226

F 

Faiz, Jawad, 125, 131, 220, 315 

Farle, Ortwin, 13, 19 

Farnleitner, Ernst, 31, 41 

Ferraioli, Fabrizio, 208 

Figueiredo, William, 175 

Fonteyn, Katarzyna, 214 

Formisano, Alessandro, 108, 208, 357 

Fornieles, Jesús, 7 

Fujita, Yoshihisa, 53 

Fujiwara, Koji, 113 

Fulmek, Paul, 331 

G 

Gavrila, Horia, 352 

Gergely, Koczka, 337 

Ghorbanian, Vahid, 125 

Giaccone, Luca, 101 

Gigov, Georgi, 63 

Gjonaj, Erion, 204 

Glotic, Adnan, 398 

Glotic, Arnel, 238, 398 

Göhner, Peter, 89 

Guarnieri, Massimo, 374 

Gueorgiev, Vultchan, 59 

Guimaraes, Frederico, 160 

Gyimóthy, Szabolcs, 242, 346 

H 

Hameyer, Kay, 198, 305 

Handgruber, Paul, 190 

Hantila, Florea I., 352 

Hauck, Andreas, 226 

Hecquet, Michel, 78 

Herold, Thomas, 198 

Hinov, Krastio, 59 

I 

Iatcheva, Ilona, 63 

Igarashi, Hajime, 276, 340 

Ikuno, Soichiro, 47, 53 

Ilić, Saša, 73, 300 

Iovine, Renato, 25 

Itoh, Taku, 47, 53 

J 

Janousek, Ladislav, 262 

Jorks, Hai Van, 204 

Jüttner, Matthias, 89, 386 

K 

Kaimori, Hiroyuki, 84 

Kaltenbacher, Manfred, 181, 226 

Kamitani, Atsushi, 47, 53

Karastoyanov, Dimitar, 59 

Kastner, Gebhard, 31, 41 

Katsumi, Ryuichi, 242 

Keränen, Janne, 392 

Kettunen, Lauri, 392 

Kiss, Péter, 242 

Kitak, Peter, 238, 398 

Klomberg, Stephan, 41 

Koczka, Gergely, 327 

Kömürgöz, Güven, 154 

Kotlan, Vaclav, 321 

Kouhia, Reijo, 214 

Kraiger, Markus, 310 

Krstic, Dejan, 294 

Kunov, Georgi, 63 

L 

La Spada, Luigi, 25 

Lambert, Nancy, 1 

Lehti, Leena, 392 

Li, Min, 160 

Lichtenegger, Herbert I. M., 7 

Lowther, David, 160, 175, 248, 254 

M 

Magele, Christian, 37 

Mair, Mathias, 144 

Manca, Michele, 101 

Maricaru, Mihai, 352 

Marignetti, Fabrizio, 208 

Martone, Raffaele, 108, 208, 357 

Metzker, Isabela, 175 

Miyagi, Daisuke, 84 

Moghnieh, Hussein, 254 

Mohr, Martin, 232 

Moro, Federico, 374 

N 

Nagano, Takumi, 288 

Nakata, Susumu, 53 

Nandi, Subhasis, 315 

Neumayer, Markus, 67, 403 

O 

Offermann, Peter, 305 

Ofner, Georg, 190 

Ojaghi, Mansour, 220 

Okamoto, Yoshifumi, 113, 282, 288 

Orlande, Helcio R.B., 403 

P 

Pávó, József, 242, 346 

Perić, Mirjana, 73 

Petersson, Rickard, 1 

Piantsop Mboo, Christelle, 198 

Portí, Jorge, 7 

Preda, Gabriel, 262 

Preis, Kurt, 271, 327, 337 


R 

Raicevic, Nebojsa, 73 

Rainer, Siegfried, 144 

Ramarotafika, Rindra, 95 

Ramirez, Jaime, 160, 175 

Rasilo, Paavo, 214 

Rauscher, Michael, 89 

Rebican, Mihai, 262 

Recheis, Manes, 331 

Renhart, Werner, 37 

Rossi, Carlo Riccardo, 171 

Rubesa, Jelena, 380 

Rubinacci, Guglielmo, 208 

Rucker, Wolfgang M., 89, 386 

Ruela, Andre, 160 

S 

Sabouri, Mahdi, 220 

Sadovic, Salih, 167 

Salinas, Alfonso, 7 

Santos, Rafael, 175 

Sato, Shuji, 113, 282 

Sato, Yuki, 340 

Scharrer, Matthias, 368 

Schnizer, Bernhard, 310 

Schöberl, Joachim, 226 

Schrittwieser, Maximilian, 31 

Schweighofer, Bernhard, 331 

Shimoyama, Kouske, 84 

Sieni, Elisabetta, 171 

Silva, Elizabeth, 175 

Silvestro, John, 1 

Simioli, Marco, 101 

Smetana, Milan, 262 

Sommer, Alexander, 19 

Sonmez, Oluş, 154 

Stancheva, Rumena, 63 

Steiner, Gerald, 67, 403 

Stella, Andrea, 374 

Stermecki, Andrej, 190, 232 

Štih, Željko, 137 

Stojanovic, Miodrag, 294 

Strapacova, Tatiana, 262 

Suhr, Bettina, 368, 380 

Suuriniemi, Saku, 392 

Szabo, Zsolt, 119 

T 

Takahashi, Norio, 84 

Takbash, Amir Masoud, 131, 315 

Tamburrino, Antonello, 208 

Tarhasaari, Timo, 392 

Ticar, Igor, 238, 398 

Toledo-Redondo, Sergio, 7 

Toratani, Tomoaki, 242 

Trkulja, Bojan, 137 

Tsuburaya, Tomonori, 113 

Tuerk, Christian, 37

U 

Ulrych, Bohus, 321 

V 

Vale, Joao Francisco, 175 

Varga, Gábor, 242 

Vasilescu, George-Marian, 352 

Vegni, Lucio, 25 

Ventre, Salvatore, 208 

Vizireanu, Darius, 78 

Volk, Adrian, 181 

Volkwein, Stefan, 362 

Voracek, Lukas, 321 

Vuckovic, Ana, 300 

Vuckovic, Dragan, 294 

W 

Wakao, Shinji, 288 

Watanabe, Yuta, 276 

Watzenig, Daniel, 67, 368, 403 

Wegleiter, Hannes, 331 

Weiland, Thomas, 204 

Weilharter, Bernhard, 144 

Werth, Tobias, 271 

Wesche, Andrea, 362 

Y 

Yasukawa, Shogo, 288 

Yatchev, Ivan, 59 

Z 

Zagar, Bernhard G., 186 

Zhao, Kezhong, 1 

Župan, Tomislav, 137
