single PDF file - Madagascar

MADAGASCAR DOCUMENTATION 

Maurice Aye-Aye, Sergey Fomel, Gilles Hennenfent, and Paul Sava 

http://ahay.org/

Copyright © 2011-12

by Madagascar Community


RSF — TABLE OF CONTENTS 

Maurice the Aye-Aye, Madagascar tutorial: Field data processing

Paul Sava, Seismic Imaging Tutorial: “exploding reflector” modeling/migration

Maurice the Aye-Aye, Madagascar tutorial

Sergey Fomel, Guide to Madagascar programs

Sergey Fomel, Guide to RSF format

Sergey Fomel, Revisiting SEP tour with Madagascar and SCons

Sergey Fomel, Guide to RSF API

Paul Sava, Guide to programming using RSF

Sergey Fomel and Gilles Hennenfent, Reproducible computational experiments using SCons

Madagascar Documentation, RSF, July 19, 2012 

Madagascar tutorial: Field data processing 

Maurice the Aye-Aye 

ABSTRACT 

In this tutorial, you will learn about multiple attenuation using the parabolic Radon
transform (Hampson, 1986). You will first go through an example that explains
the process step by step. You will be asked to change some parameters and add
a few missing lines. In the next part of the tutorial, you will be asked to apply the
same workflow to another CMP gather. The CMP gathers used in the tutorial
are from the Canterbury data set (Lu et al., 2003). By the end of this tutorial,
you should have learned to:

1. apply NMO and inverse NMO for a CMP gather, 

2. apply forward and inverse parabolic Radon transform,

3. design a mute function that preserves multiples in the Radon domain, 

4. subtract multiples from the data, 

5. create a semblance scan for a CMP gather. 

PREREQUISITES

Completing this tutorial requires

• Madagascar software environment available from 

http://www.ahay.org 

• LaTeX environment with SEGTeX available from

http://www.ahay.org/wiki/SEGTeX 

To do the assignment on your personal computer, you need to install the required 

environments. An Internet connection is required for access to the data repository. 

The tutorial itself is available from the Madagascar repository by running 

svn co https://rsf.svn.sourceforge.net/svnroot/rsf/trunk/book/rsf/school2012 


INTRODUCTION 

In this tutorial, you will be asked to run commands from the Unix shell (identified 

by bash$) and to edit files in a text editor. Different editors are available in a typical 

Unix environment (vi, emacs, nedit, etc.).

Your first assignment: 

1. Open a Unix shell. 

2. Change directory to the tutorial directory 

bash$ cd $RSFSRC/book/rsf/school2012 

3. Open the tutorial.tex file in your favorite editor, for example by running 

bash$ nedit tutorial.tex & 

4. Look at the first line in the file and change the author name from Maurice the 

Aye-Aye to your name (first things first). 

Part One 

DEMO 

1. Change directory to the demo directory 

bash$ cd demo 

2. Run 

bash$ scons cmp.view 

in the Unix shell. A number of commands will appear in the shell followed by 

Figure 3(a) appearing on your screen. 

3. To understand the commands, examine the script that generated them by opening 

the SConstruct file in a text editor. Notice that, instead of shell commands,

the script contains rules. 

• The first rule, Fetch, allows the script to download the input data file 

cmp1.rsf from the data server. 

• Other rules have the form Flow(target,source,command) for generating 

data files or Plot and Result for generating picture files.


• Fetch, Flow, Plot, and Result are defined in Madagascar’s rsf.proj 

package, which extends the functionality of SCons.

4. To better understand how rules translate into commands, run 

bash$ scons -c cmp.rsf 

The -c flag tells scons to remove the cmp.rsf file and all its dependencies. 

5. Next, run 

bash$ scons -n cmp.rsf 

The -n flag tells scons not to run the command but simply to display it on the 

screen. Identify the lines in the SConstruct file that generate the output you 

see on the screen. 

6. Run 

bash$ scons cmp.rsf 

Examine the file cmp.rsf both by opening it in a text editor and by running 

bash$ sfin cmp.rsf 
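When you open cmp.rsf in a text editor, you should see plain text made of key=value pairs, with the in= parameter pointing to the location of the binary data. The sketch below shows one way to read such a header in Python. It is only an illustration: the function name parse_rsf_header and the sample header contents (sizes, sampling, and path) are hypothetical and are not taken from the actual cmp.rsf file.

```python
import shlex


def parse_rsf_header(text):
    """Collect key=value pairs from RSF-style header text.

    Later values override earlier ones, so the last assignment wins.
    """
    params = {}
    for line in text.splitlines():
        try:
            tokens = shlex.split(line)
        except ValueError:
            continue  # skip lines with unbalanced quotes
        for token in tokens:
            if "=" in token:
                key, value = token.split("=", 1)
                params[key] = value
    return params


# Hypothetical header contents, for illustration only
sample = """
sfdd form=native
    n1=1000 d1=0.004 o1=0
    n2=60 d2=25 o2=29.25
    data_format="native_float" esize=4
    in="/tmp/cmp.rsf@"
"""

header = parse_rsf_header(sample)
print("samples per trace:", header["n1"])
print("number of traces: ", header["n2"])
print("binary data file: ", header["in"])
```

Comparing such a listing with the output of sfin is a quick way to see which parameters describe the data dimensions and where the binary part lives.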

Part Two 

Figure 3(a) shows a CMP gather from Line 12 of the Canterbury data set. The multiple
energy appears at times around 2.25 s. Figure 6 shows the same gather after applying
NMO correction with a velocity of 1500 m/s. The multiple events starting at
around 2.25 s and below are flattened, while primary events, e.g. at 2 s, are
overcorrected. The difference in moveout between the primaries and the multiples
can therefore be used in the Radon domain to attenuate multiple energy. Figure 2(a) is generated by
the forward parabolic Radon transform, while Figure 1(d) is generated by the inverse parabolic
Radon transform. The purpose is to make sure that the forward and inverse transforms
do not cause any data loss.
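The flattening and overcorrection described above follow from the hyperbolic moveout equation t(x) = sqrt(t0^2 + x^2/v^2). The sketch below is a minimal illustration in plain Python, not the sfnmostretch implementation. The multiple time (2.25 s), the primary time (2 s), and the NMO velocity (1500 m/s) come from the text; the primary velocity of 2000 m/s and the offset of 1500 m are assumptions made only for this example.

```python
import math


def traveltime(t0, offset, velocity):
    """Hyperbolic reflection time at a given offset."""
    return math.sqrt(t0 ** 2 + (offset / velocity) ** 2)


def nmo_corrected_time(t, offset, v_nmo):
    """Zero-offset time assigned to time t by NMO with velocity v_nmo."""
    arg = t ** 2 - (offset / v_nmo) ** 2
    if arg < 0.0:
        return None  # sample falls outside the valid NMO range
    return math.sqrt(arg)


water_velocity = 1500.0    # m/s, NMO velocity used in the tutorial
primary_velocity = 2000.0  # m/s, assumed value for illustration
offset = 1500.0            # m, assumed far offset

# Multiple that travels with water velocity: NMO flattens it
t_multiple = traveltime(2.25, offset, water_velocity)
flat = nmo_corrected_time(t_multiple, offset, water_velocity)

# Faster primary: the same NMO moves it above its zero-offset time
t_primary = traveltime(2.0, offset, primary_velocity)
over = nmo_corrected_time(t_primary, offset, water_velocity)

print("multiple after NMO: %.3f s (zero-offset time 2.250 s)" % flat)
print("primary after NMO:  %.3f s (zero-offset time 2.000 s)" % over)
```

An event that ends up earlier than its zero-offset time at far offsets curves upward on the gather, which is what overcorrection looks like.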

Figure 2(a) shows the Radon transform of the CMP gather in Figure 3(a), while
Figure 2(b) shows only the multiple energy in the Radon domain after muting the
primary energy. The preserved multiples can be taken back to the time-offset domain
and subtracted from the data.
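To see why a mute in the Radon domain can separate the two kinds of events, the sketch below stacks a tiny synthetic gather along parabolas t = tau + p (x/x0)^2 and then zeroes the part of the model that holds the overcorrected event. It is a toy under stated assumptions: the stacking operator is a plain sum with nearest-sample shifts (not the sfradon algorithm), the mute is a simple cut in p (not the triangular sfmutter mute used in the SConstruct), and all sizes and parameter values are made up for the example.

```python
def shift_samples(p, offset, x0, dt):
    """Parabolic moveout p*(x/x0)^2 expressed in whole time samples."""
    return int(round(p * (offset / x0) ** 2 / dt))


def add_event(data, offsets, tau_index, p, x0, dt):
    """Place a unit spike on every trace along one parabola."""
    nt = len(data[0])
    for ix, x in enumerate(offsets):
        it = tau_index + shift_samples(p, x, x0, dt)
        if 0 <= it < nt:
            data[ix][it] += 1.0


def stack_along_parabolas(data, offsets, p_values, x0, dt):
    """Sum the gather along parabolas for every (p, tau) pair."""
    nt = len(data[0])
    model = [[0.0] * nt for _ in p_values]
    for ip, p in enumerate(p_values):
        for ix, x in enumerate(offsets):
            s = shift_samples(p, x, x0, dt)
            for it in range(nt):
                if 0 <= it + s < nt:
                    model[ip][it] += data[ix][it + s]
    return model


def keep_multiples(model, p_values, p_cut):
    """Zero all curvatures below p_cut (assumed to hold the primaries)."""
    return [row[:] if p >= p_cut else [0.0] * len(row)
            for row, p in zip(model, p_values)]


# Hypothetical geometry and sampling
dt, nt, x0 = 0.01, 100, 1000.0
offsets = [100.0 * k for k in range(11)]
p_values = [-0.2 + 0.05 * k for k in range(9)]
ip_primary, ip_multiple = 2, 4  # curvatures near -0.1 s and 0 s

gather = [[0.0] * nt for _ in offsets]
add_event(gather, offsets, 30, p_values[ip_primary], x0, dt)   # overcorrected
add_event(gather, offsets, 50, p_values[ip_multiple], x0, dt)  # flat

model = stack_along_parabolas(gather, offsets, p_values, x0, dt)
multiples_only = keep_multiples(model, p_values, p_cut=-0.025)

print("primary focus: ", model[ip_primary][30])
print("multiple focus:", model[ip_multiple][50])
print("primary after mute:", multiples_only[ip_primary][30])
```

Each event focuses at its own curvature, so removing one range of p values removes one event while leaving the other untouched.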

The CMP gather before multiple attenuation is shown in Figure 3(a) and the corresponding
semblance scan is shown in Figure 3(c). The CMP gather after multiple
attenuation is shown in Figure 3(b) and the corresponding semblance scan is shown in
Figure 3(d). The semblance scans show how multiple energy is reduced for the CMP
gather after multiple attenuation.
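Semblance measures how coherent the amplitudes are across the traces of a gather after NMO with a trial velocity. A common definition for one time sample is the squared sum of the amplitudes divided by the number of traces times the sum of squared amplitudes. The sketch below illustrates that definition only; the function name is hypothetical, and the computation in sfvscan (for example, any time windowing) may differ in detail.

```python
def semblance(samples):
    """Coherence of amplitudes across traces, between 0 and 1."""
    n = len(samples)
    energy = sum(a * a for a in samples)
    if energy == 0.0:
        return 0.0
    return sum(samples) ** 2 / (n * energy)


# A flattened event has the same amplitude on every trace
coherent = semblance([1.0, 1.0, 1.0, 1.0])

# A misaligned event changes sign from trace to trace
incoherent = semblance([1.0, -1.0, 1.0, -1.0])

print("coherent:  ", coherent)
print("incoherent:", incoherent)
```

A velocity scan repeats this measurement for every trial velocity and time, which is why flattened events show up as bright spots at their stacking velocity.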


Figure 1: CMP gather from Canterbury dataset before applying NMO (a), after applying
NMO (b), after forward parabolic Radon transform (c), after applying inverse
parabolic Radon transform (d). The forward and inverse parabolic Radon transforms
are applied in sequence to examine the parameters of the process and to ensure that
no events are lost during the process. school2012/demo cmp,nmo,taup,nmo2


Figure 2: Forward Radon transform of the gather (a). A mute is applied to preserve
multiples (b), so that the multiples can be transformed back to the time-offset domain for subtraction
from the CMP gather. school2012/demo taup,taupmult

1. To examine the forward and inverse Radon transform, run

bash$ scons taup_qc.view 

2. Edit the SConstruct file to modify the reference offset x0 for sfradon program. 

To get more details about sfradon parameters, run 

bash$ sfradon 

in a Unix shell. Check your result by running 

scons taup_qc.view 

3. Edit the SConstruct file to modify the starting time t0 for sfmutter. To get 

more details about sfmutter parameters, run sfmutter in a Unix shell. Check 

your result by running 

scons taup_mult.view 

4. Edit the SConstruct file to modify the slope v0 for sfmutter. Check
your result by running

scons taup_mult.view

Figure 3: CMP gather before multiple attenuation (a). CMP gather after
multiple attenuation (b). The gather in (a) is used to generate the semblance
scan in (c). The gather in (b) is used to generate the semblance scan in (d).
school2012/demo cmp,signal2,vscan-cmp,vscan-signal2

5. Edit the SConstruct file and find the line that says ADD CODE to create 

signal2.vpl. To get more details about sfgrey parameters, run sfgrey in a 

Unix shell. Add your code and create the vpl file by running 

scons signal2.vpl 

6. Edit the SConstruct file and find the line that says ADD CODE to display 

cmp.vpl and signal2.vpl. Add your code and view the file by running 

scons cmp_signal2.view 

7. Edit the SConstruct file and find the line that says ADD CODE to display 

vscan-cmp.vpl and vscan-signal2.vpl. Add your code and view the file by 

running 

scons v_cmp_signal2.view 

from rsf.proj import *

# download cmp1.rsf from the server
Fetch('cmp1.rsf','cant12')

# convert to native format
Flow('cmp','cmp1','dd form=native')

# create cmp.vpl file
Plot('cmp','grey title=CMP')

# water velocity 1500 m/s
wvel=1500

# NMO with water velocity
Flow('nmo','cmp','nmostretch half=n v0=%g' % wvel)

# create nmo.vpl
Plot('nmo','grey title=NMO')

# create cmp_nmo.vpl file under Fig directory
# cmp.vpl and nmo.vpl created earlier using Plot
# command will be plotted side by side
Result('cmp_nmo','cmp nmo','SideBySideAniso')

####################
# radon parameters
####################
ox=29.25
nx=60
dx=25
#---------------------
x0=800 # CHANGE ME
#---------------------
p0=-.05
dp=.0005
np=201

# forward Radon operator
radono='''
radon np=%d p0=%f dp=%f x0=%d parab=y
''' % (np,p0,dp,x0)

# inverse Radon operator
radonoinv='''
radon adj=n nx=%d ox=%g dx=%d x0=%d parab=y
''' % (nx,ox,dx,x0)

# Test radon parameters, apply forward and
# inverse Radon Transform, and QC results
#########################################
Flow('taup','nmo',radono)

# plot
Plot('taup','grey title=forward RT')

# Inverse
Flow('nmo2','taup',radonoinv)

# plot
Plot('nmo2','grey title=inverse RT')

# Display three figures to QC Radon parameters
# Check that forward and inverse Radon transforms
# do not change the data i.e. events are preserved.

Result('taup_qc','nmo taup nmo2','SideBySideAniso')

######################################
# design a mute function that protects
# multiples in the Radon domain
######################################
#----------------------------
t0=1.2 # CHANGE ME ; try 1.5
#----------------------------
# vertical position of the triangle vertex

#-----------------------------
v0=.03 # CHANGE ME ; try .015
#-----------------------------
# slope of the triangle

Flow('taupmult','taup','mutter t0=%g v0=%g' % (t0,v0))
Plot('taupmult','grey title="multiples in Radon domain"')

# Display taup.vpl and taupmult.vpl
# This display allows a flip between
# the two figures
Result('taup_mult','taup taupmult',
       '''
       cat axis=3 ${SOURCES[1]}
       | grey
       ''')

# Transform multiples from Radon domain to time-offset domain
Flow('multiple','taupmult',radonoinv)

# create multiple.vpl
Plot('multiple','grey title="multiples"')

# plot CMP and multiples side by side
Result('cmp_mult','nmo2 multiple','SideBySideAniso')

# Subtract multiples from the CMP
Flow('signal','multiple nmo2',
     '''
     add scale=-1,1 ${SOURCES[1]}
     ''')

# inverse NMO
Flow('signal2','signal',
     '''
     nmostretch inv=y half=n v0=%g
     | mutter v0=1900 x0=200
     ''' % wvel)

#-------------------------------
# ADD CODE to create signal2.vpl
#-------------------------------


#---------------------------------------------
# ADD CODE to display cmp.vpl and signal2.vpl,
# make the figures flip back and forth so you
# can examine the results of multiple
# attenuation. Let us call the output file
# cmp_signal2
#---------------------------------------------


####################
# Semblance Scan
####################
dv=10
nv=251
v0=1400
vscan='vscan v0=%d dv=%d nv=%d semblance=y half=n' % (v0,dv,nv)
pick='pick rect1=150 rect2=50 gate=20'

# semblance scan
Flow('vscan-cmp','cmp',vscan)

# semblance scan
Flow('vscan-signal2','signal2',vscan)

Plot('vscan-cmp',
     '''
     grey color=j allpos=y
     title="Velocity Scan - CMP"
     ''')

Plot('vscan-signal2',
     '''
     grey color=j allpos=y
     title="Velocity Scan - after demultiple"
     ''')

#------------------------------------
# ADD CODE to display the two figures
# vscan-cmp.vpl and vscan-signal2.vpl
# side by side. Let us call the output
# file vcmp-signal2
#------------------------------------



###################################################
# This part is to create figures for tutorial.pdf
###################################################
# define grey commands for figures to be included
# in tutorial.pdf
grey='''
grey wanttitle=n labelfat=2 titlefat=2
xll=2 yll=1.5 yur=9 xur=6
'''

greyc='''
grey wanttitle=n labelfat=2 titlefat=2
xll=2 yll=1.5 yur=9 xur=6
color=j allpos=y
'''
# create plots
Result('cmp',grey)
Result('nmo',grey)
Result('taup',grey)
Result('nmo2',grey)
Result('taupmult',grey)
Result('signal2',grey)
Result('vscan-cmp',greyc)
Result('vscan-signal2',greyc)

End()

EXERCISE 

In this part, your task is to apply the workflow explained above to a different CMP
gather that requires different parameters. The same workflow should work here, but
you need to observe that the CMP gather used for this exercise has shallow events.
This means that, after applying NMO correction, amplitudes at far offsets of the
shallow events get stretched. Therefore, an additional step is required for this CMP:
we need to mute the distorted amplitudes. The mute is already applied in the
SConstruct.
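The reason shallow events suffer most can be checked with a short calculation. For hyperbolic moveout with constant velocity, NMO maps time t at offset x to t0 = sqrt(t^2 - x^2/v^2), so a short time interval is stretched by the factor t/t0 = sqrt(1 + x^2/(v^2 t0^2)). The sketch below evaluates this factor; the offset of 1500 m and the shallow event time of 0.3 s are assumed values for illustration, not parameters of the exercise data.

```python
import math


def nmo_stretch_factor(t0, offset, velocity):
    """Ratio t/t0 by which NMO lengthens a wavelet at a given offset."""
    return math.sqrt(1.0 + (offset / (velocity * t0)) ** 2)


velocity = 1500.0  # m/s, water velocity used in the tutorial
offset = 1500.0    # m, assumed far offset

shallow = nmo_stretch_factor(0.3, offset, velocity)   # assumed shallow event
deep = nmo_stretch_factor(2.25, offset, velocity)     # time of the multiples

print("stretch at 0.30 s: %.2f" % shallow)
print("stretch at 2.25 s: %.2f" % deep)
```

A factor close to 1 means the wavelet is almost unchanged, while a factor of 3 or more lowers the frequency content so much that muting is the usual remedy.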

1. Display the CMP gather after NMO with and without mute applied by running


scons nmo1_nmo.view 

2. Your task is to add the necessary code to attenuate multiples for this CMP.
The same workflow used in the SConstruct under the demo directory should work
here with changes only to

• x0

• t0

• v0

where it says CHANGE ME in the comments.

WRITING A REPORT 

1. Change directory to the parent directory 

bash$ cd .. 

This should be the directory that contains tutorial.tex. 

2. Run 

bash$ sftour scons lock 

The sftour command visits all subdirectories and runs scons lock, which 

copies result files to a different location so that they do not get modified until 

further notice. 

3. You can also run 

bash$ sftour scons -c 

to clean intermediate results. 

4. Edit the file tutorial.tex to include your additional results. If you have not used
LaTeX before, no worries. It is a descriptive language. Study the file, and it
should become evident by example how to include figures.

5. Run 

bash$ scons tutorial.pdf 

and open tutorial.pdf with a PDF viewing program such as Acrobat Reader. 

6. If you have LaTeX2HTML installed, you can also generate an HTML version of

your paper by running 

bash$ scons tutorial.html 

and opening tutorial_html/index.html in a web browser.


REFERENCES 

Hampson, D., 1986, Inverse velocity stacking for multiple elimination: J. Can. Soc.
Expl. Geophys., 22, 44–55.

Lu, H., C. S. Fulthorpe, and P. Mann, 2003, Three-dimensional architecture of shelf-building
sediment drifts in the offshore Canterbury Basin, New Zealand: Marine
Geology, 193, 19–47.



Seismic Imaging Tutorial: 

“exploding reflector” modeling/migration 

Paul Sava 

Center for Wave Phenomena 

Colorado School of Mines 1 

ABSTRACT 

This document demonstrates how reproducible numeric experiments constructed 

using the madagascar software package can be integrated into a document 

generated using the LaTeX typesetting program. I use a simple modeling/migration

exercise based on the exploding reflector model to illustrate reproducible 

document generation. 

INTRODUCTION 

Acoustic modeling and migration can be implemented using numeric solutions to the 

acoustic wave-equation (Clærbout, 1985): 

\frac{1}{v^2} \ddot{W} - \rho \nabla \cdot \left( \frac{1}{\rho} \nabla W \right) = f .    (1)

In Equation 1, W (x, t) represents the acoustic wavefield, v (x) and ρ (x) represent 

the velocity and density of the medium, respectively, and f (x, t) represents a source 

function. 

• In modeling, we use the distributed source f (x, t) to generate the wavefield 

W (x, t) at all positions and all times by wave propagation forward in time. 

The data represent a subset of the wavefield observed at receivers distributed 

in the medium: D (r, t) = W (x = r, t). 

• In migration, we use the observed data D (r, t) to generate the wavefield W (x, t) 

at all positions and all times by wave propagation backward in time. The image 

represents a subset of the wavefield at time zero: R (x) = W (x, t = 0).

In both cases, we solve Equation 1 with different initial conditions, but with the same
model, v (x) and ρ (x), and with the same boundary conditions.
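For constant density, Equation 1 reduces to Ẅ = v² (∇²W + f), which can be stepped forward in time with second-order finite differences. The sketch below does this in one dimension with plain Python lists. It is only a toy under stated assumptions (1-D, constant density, an impulsive source, rigid edges, hypothetical grid values) and is not the sfawefd2d algorithm. The recorded trace plays the role of the data D(r, t) = W(x = r, t).

```python
def model_wavefield(velocity, source_index, receiver_index, nt, dt, dx):
    """Propagate an impulse forward in time; return last snapshot and data."""
    nx = len(velocity)
    previous = [0.0] * nx
    current = [0.0] * nx
    trace = []
    for it in range(nt):
        following = [0.0] * nx  # edge samples stay zero (rigid boundaries)
        for ix in range(1, nx - 1):
            laplacian = (current[ix + 1] - 2.0 * current[ix]
                         + current[ix - 1]) / dx ** 2
            # impulsive source injected at the first time step only
            source = 1.0 if (it == 0 and ix == source_index) else 0.0
            following[ix] = (2.0 * current[ix] - previous[ix]
                             + (velocity[ix] * dt) ** 2 * (laplacian + source))
        previous, current = current, following
        trace.append(current[receiver_index])  # D(r,t) = W(x=r,t)
    return current, trace


# Hypothetical grid: 201 points, 10 m spacing, 2 km/s, 2 ms time step.
# The ratio v*dt/dx = 0.4 keeps the explicit scheme stable.
nx = 201
velocity = [2.0] * nx  # km/s
wavefield, data = model_wavefield(velocity, source_index=100,
                                  receiver_index=110,
                                  nt=50, dt=0.002, dx=0.01)

print("largest amplitude in snapshot:", max(abs(w) for w in wavefield))
print("samples recorded at receiver: ", len(data))
```

Migration in this framework would run the same stepping with the recorded data injected at the receiver positions in reverse time order, which is why modeling and migration can share one propagation code.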

1 e-mail: psava@mines.edu 


EXAMPLE 

I illustrate the method using the Sigsbee 2A synthetic model. This model is based on 

the Sigsbee structure in the Gulf of Mexico and the velocity model is illustrated in 

Figure 1. The model is characterized by a massive salt body close to the water bottom 

and surrounded by sediments. The salt velocity is 4.5 km/s and the surrounding 

sediment velocity ranges from approximately 1.5 to 3.25 km/s. 

Figure 1: Stratigraphic Sigsbee 2A velocity model. school/sigsbee vstr

In this experiment, I consider sources distributed uniformly in the subsalt region 

of the model. The data are acquired in a borehole array, located at x = 8.5 km and 

a horizontal array located at z = 1.5 km. In order to avoid multiple scattering in the 

subsurface, I simulate waves in a smooth version of the Sigsbee model, illustrated in 

Figure 2, and constant density. 

Figure 2: Smooth Sigsbee 2A velocity model. school/sigsbee vsmo

Using the madagascar program sfawefd2d, we can simulate wavefields from the 

distributed sources. Figures 3(a)-3(h) show wavefield snapshots in order of increasing 

times. We can observe waves propagating from all subsalt sources, interacting with 

the variable velocity medium and arriving at the vertical and horizontal arrays. 

Figures 4(a) and 4(b) show the data observed at the horizontal array in variable 

density and wiggle plotting formats, respectively. Similarly, Figures 5(a) and 5(b) 

show the data observed in the vertical array using the same plotting formats. The 

data are just subsets of the same wavefields at the respective receiver positions and 

capture the complications observed in the wavefield, i.e. triplications due to lateral 

velocity variation.


Figure 3: Wavefield snapshots at increasing times. 

school/sigsbee wfld-01,wfld-03,wfld-05,wfld-07,wfld-09,wfld-11,wfld-13,wfld-15


Figure 4: Acoustic data observed in the horizontal array. school/sigsbee datH,wigH


Figure 5: Acoustic data observed in the vertical array. school/sigsbee datV,wigV


In zero-offset migration, we use the observed data to backpropagate the wavefields

using the data as boundary conditions. The image is the wavefield at time zero. 

Since we can acquire data at different locations in space, the reconstructed wavefields 

depend on the acquisition geometry, thus limiting the illumination in the subsurface. 

Therefore, the migrated images depend on the acquisition array, as illustrated in 

Figures 6(a) and 6(b) for the horizontal and vertical arrays, respectively. We can 

also obtain images by migrating the data observed in both the horizontal and vertical 

arrays, as illustrated in Figure 7, thus increasing the acquisition aperture and the 

subsurface illumination. 

CONCLUSIONS 

The combination of LaTeX and madagascar allows geoscientists to generate reproducible

documents where the numeric examples can be verified by any user with the 

same computer setup. This allows for transparent peer-review, for recursive development 

and for transfer of technology between collaborative research groups. 

ACKNOWLEDGMENTS 

The reproducible numeric examples in this paper use the madagascar open-source 

software package freely available from http://www.reproducibility.org. 

REFERENCES 

Clærbout, J. F., 1985, Imaging the Earth’s interior: Blackwell Scientific Publications.


Figure 6: Migrated images for data acquired in (a) the horizontal array and (b) the 

vertical array. school/sigsbee imgH,imgV


Figure 7: Migrated image for data acquired in the horizontal and the vertical arrays. 

school/sigsbee imgA


Madagascar tutorial 

Maurice the Aye-Aye 1 

ABSTRACT 

In this tutorial, you will go through different steps required for writing a research 

paper with reproducible examples. In particular, you will 

1. identify a research problem, 

2. suggest a solution, 

3. test your solution using a synthetic example, 

4. apply your solution to field data, 

5. write a report about your work. 

PREREQUISITES

Completing this tutorial requires

• Madagascar software environment available from 

http://www.ahay.org 

• LaTeX environment with SEGTeX available from

http://www.ahay.org/wiki/SEGTeX 

To do the assignment on your personal computer, you need to install the required 

environments. An Internet connection is required for access to the data repository. 

The tutorial itself is available from the Madagascar repository by running 

svn co https://rsf.svn.sourceforge.net/svnroot/rsf/trunk/book/rsf/school2009 

INTRODUCTION 

In this tutorial, you will be asked to run commands from the Unix shell (identified 

by bash$) and to edit files in a text editor. Different editors are available in a typical 

Unix environment (vi, emacs, nedit, etc.).

1 e-mail: psava@mines.edu 



Your first assignment: 

1. Open a Unix shell. 

2. Change directory to the tutorial directory 

bash$ cd $RSFSRC/book/rsf/school2009 

3. Open the paper.tex file in your favorite editor, for example by running 

bash$ nedit paper.tex & 

4. Look at the first line in the file and change the author name from Maurice the 

Aye-Aye to your name (first things first). 

PROBLEM 

Figure 1: Depth slice from 3-D seismic (left) and output of edge detection (right). 

school2009/channel horizon 

The left plot in Figure 1 shows a depth slice from a 3-D seismic volume (courtesy of
Matt Hall, ConocoPhillips Canada Ltd.). You
notice a channel structure and decide to extract it using an edge detection algorithm
from the image processing literature (Canny, 1986). In a nutshell, Canny’s edge
detector picks areas of high gradient that seem to be aligned along an edge. The
extracted edges are shown in the right plot of Figure 1. The initial result is not too
clear, because it is affected by random fluctuations in seismic amplitudes. The goal
of your research project is to achieve a better result in automatic channel extraction.

1. Change directory to the project directory

bash$ cd channel

2. Run 

bash$ scons horizon.view 

in the Unix shell. A number of commands will appear in the shell followed by 

Figure 1 appearing on your screen. 

3. To understand the commands, examine the script that generated them by opening 

the SConstruct file in a text editor. Notice that, instead of shell commands,

the script contains rules. 

• The first rule, Fetch, allows the script to download the input data file 

horizon.asc from the data server. 

• Other rules have the form Flow(target,source,command) for generating 

data files or Plot and Result for generating picture files. 

• Fetch, Flow, Plot, and Result are defined in Madagascar’s rsf.proj 

package, which extends the functionality of SCons (Fomel and Hennenfent, 

2007). 

4. To better understand how rules translate into commands, run 

bash$ scons -c horizon.rsf 

The -c flag tells scons to remove the horizon.rsf file and all its dependencies. 

5. Next, run 

bash$ scons -n horizon.rsf 

The -n flag tells scons not to run the command but simply to display it on the 

screen. Identify the lines in the SConstruct file that generate the output you 

see on the screen. 

6. Run 

bash$ scons horizon.rsf 

Examine the file horizon.rsf both by opening it in a text editor and by running 

bash$ sfin horizon.rsf 

How many different Madagascar modules were used to create this file? What 

are the file dimensions? Where is the actual data stored?


7. Run 

bash$ scons smoothed.rsf 

Notice that the horizon.rsf file is not being rebuilt. 

8. What does the sfsmooth module do? Find it out by running 

bash$ sfsmooth 

without arguments. Has sfsmooth been used in any other Madagascar examples? 

9. What other Madagascar modules perform smoothing? To find out, run 

bash$ sfdoc -k smooth 

10. Notice that Figure 1 does not make a very good use of the color scale. To 

improve the scale, find the mean value of the data by running 

bash$ sfattr < horizon.rsf 

and insert it as a new value for the bias= parameter in the SConstruct file. 

Does smoothing by sfsmooth change the mean value? 

11. Save the SConstruct file and run 

bash$ scons view 

to view improved images. Notice that horizon.rsf and smoothed.rsf files are 

not being rebuilt. SCons is smart enough to know that only the part affected 

by your changes needs to be updated. 
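The idea behind this selective rebuilding can be illustrated with a toy model: store a signature of each target's inputs (source contents plus the command), and rerun a rule only when that signature changes. The sketch below is a simplified illustration of the principle, not the SCons implementation; the class ToyBuilder, its methods, and the strings standing in for file contents and commands are all hypothetical.

```python
import hashlib


def signature(*parts):
    """Hash the inputs that determine a target (sources and command)."""
    digest = hashlib.sha256()
    for part in parts:
        digest.update(part.encode("utf-8"))
        digest.update(b"\0")  # separator so that parts cannot run together
    return digest.hexdigest()


class ToyBuilder:
    """Remember one signature per target and rebuild only on change."""

    def __init__(self):
        self.stored = {}
        self.rebuilt = []

    def build(self, target, source_text, command):
        new_signature = signature(source_text, command)
        if self.stored.get(target) == new_signature:
            return False  # up to date, nothing to do
        self.stored[target] = new_signature
        self.rebuilt.append(target)
        return True


builder = ToyBuilder()
rules = [
    ("horizon.rsf", "raw ascii data", "dd form=native"),
    ("smoothed.rsf", "horizon data", "smooth rect1=20 rect2=20"),
    ("horizon.vpl", "horizon data", "grey color=j bias=0"),
]
for rule in rules:
    builder.build(*rule)

# Second pass: only the plotting command has been edited
builder.rebuilt = []
rules[2] = ("horizon.vpl", "horizon data", "grey color=j bias=65")
for rule in rules:
    builder.build(*rule)

print("rebuilt on second pass:", builder.rebuilt)
```

Because the data files keep the same inputs and commands, only the plot is regenerated on the second pass, which mirrors what you observe when you change bias= and run scons view.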

As shown in Figure 2, smoothing removes random amplitude fluctuations but at
the same time broadens the channel and thus makes the channel edge detection unreliable.
In the next part of this tutorial, you will try to find a better solution by examining
a simple one-dimensional synthetic example.


Figure 2: Depth slice from Figure 1 after smoothing (left) and output of edge detection
(right). school2009/channel smoothed

from rsf.proj import *

# Download data
Fetch('horizon.asc','hall')

# Convert format
Flow('horizon','horizon.asc',
     '''
     echo in=$SOURCE data_format=ascii_float n1=3 n2=57036 |
     dd form=native | window n1=1 f1=-1 |
     put
     n1=196 o1=33.139 d1=0.01 label1=y unit1=km
     n2=291 o2=35.031 d2=0.01 label2=x unit2=km
     ''')

# Triangle smoothing
Flow('smoothed','horizon','smooth rect1=20 rect2=20')

# Display results
for horizon in ('horizon','smoothed'):
    # --- CHANGE BELOW ---
    Plot(horizon,'grey color=j bias=0 yreverse=n wanttitle=n')
    edge = 'edge-'+horizon
    Flow(edge,horizon,'canny max=98 | dd type=float')
    Plot(edge,'grey allpos=y yreverse=n wanttitle=n')
    Result(horizon,[horizon,edge],'SideBySideIso')

End()

1-D SYNTHETIC 

To better understand the effect of smoothing, you decide to create a one-dimensional 

synthetic example shown in Figure 3(a). The synthetic contains both sharp edges and 

random noise. The output of conventional triangle smoothing is shown in Figure 3(b). 

We see an effect similar to the one in the real data example: random noise gets
removed by smoothing at the expense of blurring the edges. Can you do better?

Figure 3: (a) 1-D synthetic to test edge-preserving smoothing. (b) Output of conventional
triangle smoothing. school2009/local step,smooth

Figure 4: (a) Input synthetic trace duplicated multiple times. (b) Duplicated traces 

shifted so that each data sample gets surrounded by its neighbors. The original trace 

is in the middle. school2009/local spray,local 

To better understand what is happening in the process of smoothing, let us convert
the 1-D signal into a 2-D signal by first replicating the trace several times and then shifting
the replicated traces with respect to the original trace (Figure 4). This creates a
2-D dataset, where each sample on the original trace is surrounded by samples from
neighboring traces.

Every local filtering operation can be understood as stacking traces from Figure 

4(b) multiplied by weights that correspond to the filter coefficients. 
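The claim that local filtering equals a weighted stack of shifted traces can be checked directly in a few lines of Python. The sketch below builds the shifted copies, applies triangle weights, and stacks. It is a minimal illustration with hypothetical function names and a zero-padded boundary, not the implementation used by sfsmooth.

```python
def shifted_copies(trace, radius):
    """Return 2*radius+1 shifted copies; samples outside the trace are zero."""
    n = len(trace)
    return [[trace[i + s] if 0 <= i + s < n else 0.0 for i in range(n)]
            for s in range(-radius, radius + 1)]


def triangle_weights(radius):
    """Triangle weights 1 - |s|/x0 with x0 = radius + 1."""
    return [1.0 - abs(s) / (radius + 1.0) for s in range(-radius, radius + 1)]


def weighted_stack(copies, weights):
    """Stack the shifted copies with the given weights, normalized to sum 1."""
    total = sum(weights)
    n = len(copies[0])
    return [sum(w * c[i] for w, c in zip(weights, copies)) / total
            for i in range(n)]


# A single spike shows the filter coefficients directly
spike = [0.0] * 11
spike[5] = 1.0
smoothed_spike = weighted_stack(shifted_copies(spike, 2), triangle_weights(2))

# A constant trace is unchanged away from the boundaries
constant = [1.0] * 11
smoothed_constant = weighted_stack(shifted_copies(constant, 2),
                                   triangle_weights(2))

print(["%.3f" % v for v in smoothed_spike])
```

Replacing triangle_weights with any other set of coefficients turns the same stack into a different local filter, which is the point the shifted-trace picture is meant to convey.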


1. Change directory to the project directory

bash$ cd ../local


2. Verify the claim above by running 

bash$ scons smooth.view smooth2.view 

Open the SConstruct file in a text editor to verify that the first image is 

computed by sfsmooth and the second image is computed by applying triangle 

weights and stacking. To compare the two images by flipping between them, 

run 

bash$ sfpen Fig/smooth.vpl Fig/smooth2.vpl 

3. Edit SConstruct to change the weight from triangle

W_T(x) = 1 - \frac{|x|}{x_0}    (1)

to Gaussian

W_G(x) = \exp\left( -\alpha \frac{|x|^2}{x_0^2} \right)    (2)

Repeat the previous computation. Does the result change? What is a good
value for α?

4. Thinking about this problem, you invent an idea (see footnote 3). Why not apply non-linear
filter weights that would discriminate between points not only based on their
distance from the center point but also on the difference in function values
between the points? That is, instead of filtering by

g(x) = \int f(y) \, W(x - y) \, dy ,    (3)

where f(x) is the input, g(x) is the output, and W(x) is a linear weight, you decide to
filter by

g(x) = \int f(y) \, \hat{W}(x - y, f(x) - f(y)) \, dy ,    (4)

where Ŵ(x, z) is a non-linear weight. Compare the two weights by running

bash$ scons triangle.view similarity.view

The results should look similar to Figure 5.

5. The final output is Figure 6. By examining SConstruct, find how to reproduce
this figure.

6. EXTRA CREDIT If you are familiar with programming in C, add 1-D non-local
filtering as a new Madagascar module sfnonloc. Ask the instructor
for further instructions.


Figure 5: (a) Linear and stationary triangle weights. (b) Non-linear and nonstationary 

weights reflecting both distance between data points and similarity in 

data values. school2009/local triangle,similarity 

Figure 6: Output of non-local smoothing. school2009/local nlsmooth


Figure 6 shows that non-linear filtering can eliminate random noise while preserving 

the edges. The problem is solved! Now let us apply the result to our original 

problem. 
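The effect of weights that depend on both distance and difference in values can be reproduced in a few lines of Python. The sketch below is one possible choice, assuming a weight of the form exp(-ax s² - ay (f(x) - f(y))²) normalized by the sum of the weights. The normalization, the function name, and the parameter values are assumptions for illustration and are not taken from the tutorial's SConstruct or C program.

```python
import math


def nonlocal_smooth(trace, ns, ax, ay):
    """Smooth with weights that decay with distance and value difference.

    Setting ay=0 gives ordinary (linear) Gaussian smoothing.
    """
    n = len(trace)
    output = []
    for i in range(n):
        numerator = 0.0
        denominator = 0.0
        for s in range(-ns, ns + 1):
            j = i + s
            if 0 <= j < n:
                difference = trace[i] - trace[j]
                weight = math.exp(-ax * s * s - ay * difference * difference)
                numerator += weight * trace[j]
                denominator += weight
        output.append(numerator / denominator)
    return output


# Noise-free step: ten samples at 0 followed by ten samples at 1
step = [0.0] * 10 + [1.0] * 10

linear = nonlocal_smooth(step, ns=5, ax=0.1, ay=0.0)     # blurs the edge
nonlinear = nonlocal_smooth(step, ns=5, ax=0.1, ay=50.0)  # keeps the edge

print("last sample before the edge, linear:     %.3f" % linear[9])
print("last sample before the edge, non-linear: %.3f" % nonlinear[9])
```

With ay=0 the sample next to the edge is pulled toward the other side, while a large ay makes samples across the edge contribute almost nothing, so the step stays sharp.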

/* Non-local smoothing. */
#include <rsf.h>

int main (int argc, char *argv[])
{
    int n1, n2, i1, i2, is, ns;
    float *trace, *trace2, ax, ay, t;
    sf_file inp, out;

    /* initialize */
    sf_init(argc,argv);

    /* set input and output files */
    inp = sf_input("in");
    out = sf_output("out");

    /* get input dimensions */
    if (!sf_histint(inp,"n1",&n1))
        sf_error("No n1= in input");
    n2 = sf_leftsize(inp,1);

    /* get command-line parameters */
    if (!sf_getint("ns",&ns)) sf_error("Need ns=");
    /* spray radius */

    if (!sf_getfloat("ax",&ax)) sf_error("Need ax=");
    /* exponential weight for the coordinate distance */

    trace = sf_floatalloc(n1);
    trace2 = sf_floatalloc(n1);

    /* loop over traces */
    for (i2=0; i2 < n2; i2++) {
        /* read input */
        sf_floatread(trace,n1,inp);

        /* loop over samples */
        for (i1=0; i1 < n1; i1++) {
            t = 0.;

            /* accumulate shifts */
            for (is=-ns; is <= ns; is++) {
                if (i1+is >= 0 && i1+is < n1) {

                    /* !!! MODIFY THE NEXT LINE !!! */
                    t += trace[i1+is]*expf(-ax*is*is);
                }
            }

            trace2[i1] = t;
        }

        /* write output */
        sf_floatwrite(trace2,n1,out);
    }

    /* clean up */
    sf_fileclose(inp);
    exit(0);
}

3 Actually, you reinvent the idea of bilateral or non-local filters (Tomasi and Manduchi, 1998;
Gilboa and Osher, 2008).

SOLUTION 


bash$ cd ../channel2 

2. By now, you should know what to do next. 

3. Two-dimensional shifts generate a four-dimensional volume. Verify it by running 

bash$ scons local.rsf 

and 

bash$ sfin local.rsf 

View a movie of different shifts by running 

bash$ scons local.vpl 

4. Modify the filter weights by editing SConstruct in a text editor. Observe your 

final result by running


bash$ scons smoothed2.view 

5. The file norm.rsf contains the non-linear weights stacked over different shifts. 

Add a Result statement to SConstruct that would display the contents of 

norm.rsf in a figure. Do you notice anything interesting? 

6. Apply the Canny edge detection to your final result and display it in a figure. 

7. EXTRA CREDIT Change directory to ../mona and apply your method to 

the image of Mona Lisa. Can you extract her smile? 


    from rsf.proj import *

    Fetch('horizon.asc','hall')

    # Convert format
    Flow('horizon2','horizon.asc',
         '''
         echo in=$SOURCE data_format=ascii_float n1=3 n2=57036 |
         dd form=native | window n1=1 f1=-1 |
         add add=-65 | put
         n1=196 o1=33.139 d1=0.01 label1=y unit1=km
         n2=291 o2=35.031 d2=0.01 label2=x unit2=km
         ''',stdin=0)
    Result('horizon2','grey yreverse=n color=j title=Input')

    # Spray
    Flow('spray','horizon2',
         '''
         spray axis=3 n=21 o=-0.1 d=0.01 |
         spray axis=4 n=21 o=-0.1 d=0.01
         ''')

    # Shift
    Flow('shift1','spray','window n1=1 | math output=x2')
    Flow('shift2','spray','window n2=1 | math output=x3')

    Flow('local','spray shift1 shift2',
         '''
         datstretch datum=${SOURCES[1]} | transp |
         datstretch datum=${SOURCES[2]} | transp
         ''')
    Plot('local','window j3=4 j4=4 | grey color=j',view=1)

    # --- CHANGE BELOW ---
    # try "exp(-0.1*(input-loc)^2-200*(x3^2+x4^2))"
    Flow('simil','spray local',
         '''
         math loc=${SOURCES[1]} output=1
         ''')

    Flow('norm','simil',
         'stack axis=4 | stack axis=3')

    Flow('smoothed2','local simil norm',
         '''
         add mode=p ${SOURCES[1]} |
         stack axis=4 | stack axis=3 |
         add mode=d ${SOURCES[2]}
         ''')
    Result('smoothed2','grey yreverse=n color=j title=Output')

    End()

Figure 7: Your final result. 

school2009/channel2 smoothed2 
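The last three Flow statements of the SConstruct listing multiply the shifted copies by the weights, stack over both shift axes, and divide by the stacked weights. The Python sketch below imitates that arithmetic on a tiny grid. The function name, its parameters, and the clamping of shifted indices at the grid edges are assumptions for illustration and do not describe how sfdatstretch treats boundaries.

```python
import math

def smooth2d(image, radius, a_val, a_dist):
    """Normalized non-local smoothing of a 2-D grid given as a list of rows."""
    ny, nx = len(image), len(image[0])
    out = [[0.0] * nx for _ in range(ny)]
    for iy in range(ny):
        for ix in range(nx):
            total = 0.0  # stack of (shifted data * weight)
            norm = 0.0   # stack of weights, the analogue of norm.rsf
            for sy in range(-radius, radius + 1):
                for sx in range(-radius, radius + 1):
                    # Clamp shifted indices at the edges (an assumption).
                    ky = min(max(iy + sy, 0), ny - 1)
                    kx = min(max(ix + sx, 0), nx - 1)
                    diff = image[iy][ix] - image[ky][kx]
                    w = math.exp(-a_val * diff * diff
                                 - a_dist * (sx * sx + sy * sy))
                    total += w * image[ky][kx]
                    norm += w
            out[iy][ix] = total / norm
    return out

# A constant image must pass through unchanged because of the normalization.
flat = [[2.0] * 4 for _ in range(3)]
result = smooth2d(flat, radius=1, a_val=0.1, a_dist=0.5)
```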


    from rsf.proj import *

    Fetch('mona.img','imgs')

    # Convert to standard format
    Flow('mona','mona.img',
         '''
         echo n1=512 n2=513 in=$SOURCE data_format=native_uchar |
         dd type=float
         ''',stdin=0)

    Result('mona',
           '''
           grey transp=n allpos=y title="Mona Lisa"
           color=b screenratio=1 wantaxis=n
           ''')

    End()

Figure 8: Can you apply your algorithm to Mona Lisa? school2009/mona mona

WRITING A REPORT 

1. Change directory to the parent directory 

bash$ cd .. 

This should be the directory that contains paper.tex. 

2. Run 

bash$ sftour scons lock 

The sftour command visits all subdirectories and runs scons lock, which 

copies result files to a different location so that they do not get modified until 

further notice. 

3. You can also run 

bash$ sftour scons -c


to clean intermediate results. 

4. Edit the file paper.tex to include your additional results. If you have not used
LaTeX before, no worries. It is a descriptive language. Study the file, and it
should become evident by example how to include figures.

5. Run 

bash$ scons paper.pdf 

and open paper.pdf with a PDF viewing program such as Acrobat Reader. 

6. Want to submit your paper to Geophysics? Edit SConstruct in the paper 

directory to add option=manuscript to the End statement. Then run 

bash$ scons paper.pdf 

again and display the result. 

7. If you have LaTeX2HTML installed, you can also generate an HTML version of
your paper by running

bash$ scons html 

and opening paper_html/index.html in a web browser. 

REFERENCES 

Canny, J., 1986, A computational approach to edge detection: IEEE Trans. Pattern 

Analysis and Machine Intelligence, 8, 679–714. 

Fomel, S., and G. Hennenfent, 2007, Reproducible computational experiments using 

SCons: 32nd International Conference on Acoustics, Speech, and Signal Processing 

(ICASSP), IEEE, 1257–1260. 

Gilboa, G., and S. Osher, 2008, Nonlocal operators with applications to image processing: 

Multiscale Model & Simulation, 7, 1005–1028. 

Tomasi, C., and R. Manduchi, 1998, Bilateral filtering for gray and color images:
Proceedings of IEEE International Conference on Computer Vision, IEEE, 836–846.


Guide to Madagascar programs 

Sergey Fomel 1 

ABSTRACT 

This guide introduces some of the most frequently used Madagascar programs and illustrates
their usage with examples.

MAIN PROGRAMS 

The source files for these programs can be found under system/main in the Madagascar 

distribution. 

1 e-mail: sergey.fomel@beg.utexas.edu 


sfadd: Add, multiply, or divide RSF datasets. 

sfadd > out.rsf scale= add= sqrt= abs= log= exp= mode= [< file0.rsf] file1.rsf 

file2.rsf ... 

The various operations, if selected, occur in the following order: 

(1) Take absolute value, abs= 

(2) Add a scalar, add= 

(3) Take the natural logarithm, log= 

(4) Take the square root, sqrt= 

(5) Multiply by a scalar, scale= 

(6) Compute the base-e exponential, exp= 

(7) Add, multiply, or divide the data sets, mode= 

sfadd operates on integer, float, or complex data, but all the input 

and output files must be of the same data type. 

An alternative to sfadd is sfmath, which is more versatile, but may be 

less efficient. 

bools abs= If true take absolute value [nin] 

floats add= Scalar values to add to each dataset [nin] 

bools exp= If true compute exponential [nin] 

bools log= If true take logarithm [nin] 

string mode= ’a’ means add (default), ’p’ or ’m’ means 

multiply, ’d’ means divide 

floats scale= Scalar values to multiply each dataset 

with [nin] 

bools sqrt= If true take square root [nin] 

sfadd is useful for combining (adding, dividing, or multiplying) several datasets. 

What if you want to subtract two datasets? Easy. Use the scale parameter as 

follows: 

bash$ sfadd data1.rsf data2.rsf scale=1,-1 > diff.rsf 

or 

bash$ sfadd < data1.rsf data2.rsf scale=1,-1 > diff.rsf 

The same task can be accomplished with the more general sfmath program: 

bash$ sfmath one=data1.rsf two=data2.rsf output=’one-two’ > diff.rsf 

or


bash$ sfmath < data1.rsf two=data2.rsf output=’input-two’ > diff.rsf 

In both cases, the size and shape of data1.rsf and data2.rsf hypercubes should be 

the same, and a warning message is printed out if the axis sampling parameters

(such as o1 or d1) in these files are different. 
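The order of operations listed in the sfadd self-documentation, and the scale=1,-1 subtraction trick, can be sketched in Python as follows. The function names and keyword arguments are hypothetical; only the ordering (abs, add, log, sqrt, scale, exp, then combine) is taken from the text.

```python
import math

def transform(value, scale=1.0, add=0.0, use_abs=False,
              use_log=False, use_sqrt=False, use_exp=False):
    """Apply the per-sample operations in the documented order."""
    f = abs(value) if use_abs else value
    f += add
    if use_log:
        f = math.log(f)
    if use_sqrt:
        f = math.sqrt(f)
    f *= scale
    if use_exp:
        f = math.exp(f)
    return f

def combine(datasets, scales, mode="a"):
    """Combine equally sized datasets sample by sample."""
    result = [transform(v, scale=scales[0]) for v in datasets[0]]
    for data, scale in zip(datasets[1:], scales[1:]):
        for j, v in enumerate(data):
            f = transform(v, scale=scale)
            if mode in ("p", "m"):      # multiply
                result[j] *= f
            elif mode == "d":           # divide, skipping zero divisors
                if f != 0.0:
                    result[j] /= f
            else:                       # add (default)
                result[j] += f
    return result

# Subtraction as in "sfadd data1.rsf data2.rsf scale=1,-1".
diff = combine([[5.0, 7.0, 9.0], [1.0, 2.0, 3.0]], scales=[1.0, -1.0])
```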

Implementation: system/main/add.c 

The first input file is either in the list or in the standard input. 

system/main/add.c 

    /* find number of input files */
    if (isatty(fileno(stdin))) {
        /* no input file in stdin */
        nin=0;
    } else {
        in[0] = sf_input("in");
        nin=1;
    }

Collect input files in the in array from all command-line parameters that don’t 

contain an “=” sign. The total number of input files is nin. 


    for (i=1; i< argc; i++) { /* collect inputs */
        if (NULL != strchr(argv[i],'=')) continue;
        in[nin] = sf_input(argv[i]);
        nin++;
    }
    if (0==nin) sf_error ("no input");
    /* nin = no of input files */

A helper function check_compat checks the compatibility of input files.

Finally, we enter the main loop, where the input data are read buffer by
buffer and combined in the total product depending on the data type.

The data combination program for floating point numbers is add_float.



    check_compat (sf_datatype type /* data type */,
                  size_t nin /* number of files */,
                  sf_file *in /* input files [nin] */,
                  int dim /* file dimensionality */,
                  const off_t *n /* dimensions [dim] */)
    /* Check that the input files are compatible.
       Issue error for type mismatch or size mismatch.
       Issue warning for grid parameters mismatch. */
    {
        int ni, id;
        size_t i;
        float d, di, o, oi;
        char key[3];
        const float tol=1.e-5; /* tolerance for comparison */

        for (i=1; i < nin; i++) {
            if (sf_gettype(in[i]) != type)
                sf_error ("type mismatch: need %d",type);
            for (id=1; id
                ...
                     tol*fabsf(d)))
                    sf_warning("%s mismatch: need %g",key,d);
                } else {
                    d = 1.;
                }
                (void) snprintf(key,3,"o%d",id);
                if (sf_histfloat(in[0],key,&o) &&
                    (!sf_histfloat(in[i],key,&oi) ||
                     (fabsf(oi-o) > tol*fabsf(d))))
                    sf_warning("%s mismatch: need %g",key,o);
            }
        }
    }



    for (nbuf /= sf_esize(in[0]); nsiz > 0; nsiz -= nbuf) {
        if (nbuf > nsiz) nbuf=nsiz;

        for (j=0; j < nin; j++) {
            collect = (bool) (j != 0);
            switch(type) {
                case SF_FLOAT:
                    sf_floatread((float*) bufi,
                                 nbuf,
                                 in[j]);
                    add_float(collect,
                              nbuf,
                              (float*) buf,
                              (const float*) bufi,
                              cmode,
                              scale[j],
                              add[j],
                              abs_flag[j],
                              log_flag[j],
                              sqrt_flag[j],
                              exp_flag[j]);



    static void add_float (bool collect,       /* if collect */
                           size_t nbuf,        /* buffer size */
                           float* buf,         /* output [nbuf] */
                           const float* bufi,  /* input [nbuf] */
                           char cmode,         /* operation */
                           float scale,        /* scale factor */
                           float add,          /* add factor */
                           bool abs_flag,      /* if abs */
                           bool log_flag,      /* if log */
                           bool sqrt_flag,     /* if sqrt */
                           bool exp_flag       /* if exp */)
    /* Add floating point numbers */
    {
        size_t j;
        float f;

        for (j=0; j < nbuf; j++) {
            f = bufi[j];
            if (abs_flag)    f = fabsf(f);
            f += add;
            if (log_flag)    f = logf(f);
            if (sqrt_flag)   f = sqrtf(f);
            if (1. != scale) f *= scale;
            if (exp_flag)    f = expf(f);
            if (collect) {
                switch (cmode) {
                    case 'p': /* product */
                    case 'm': /* multiply */
                        buf[j] *= f;
                        break;
                    case 'd': /* divide */
                        if (f != 0.) buf[j] /= f;
                        break;
                    default: /* add */
                        buf[j] += f;
                        break;
                }
            } else {
                buf[j] = f;
            }
        }
    }


sfattr: Display dataset attributes. 

sfattr < in.rsf lval=2 want= 

Sample output from "sfspike n1=100 | sfbandpass fhi=60 | sfattr" 

******************************************* 

rms = 0.992354 

mean = 0.987576 

2-norm = 9.92354 

variance = 0.00955481 

std dev = 0.0977487 

max = 1.12735 at 97 

min = 0.151392 at 100 

nonzero samples = 100 

total samples = 100 

******************************************* 

rms = sqrt[ sum(data^2) / n ]
mean = sum(data) / n
norm = sum(abs(data)^lval)^(1/lval)
variance = [ sum(data^2) - n*mean^2 ] / [ n-1 ]
standard deviation = sqrt [ variance ]

int lval=2 norm option, lval is a non-negative integer, 

computes the vector lval-norm 

string want= 'all' (default), 'rms', 'mean', 'norm', 'var',
             'std', 'max', 'min', 'nonzero', 'samples', 'short'
             want='rms' displays the root mean square
             want='norm' displays the square norm, otherwise specified by lval
             want='var' displays the variance
             want='std' displays the standard deviation
             want='nonzero' displays number of nonzero samples
             want='samples' displays total number of samples
             want='short' displays a short one-line version

sfattr is a useful diagnostic program. It reports certain statistical values for an 

RSF dataset: RMS (root-mean-square) amplitude, mean value, norm value, variance, 

standard deviation, maximum and minimum values, number of nonzero samples, and 

the total number of samples. 

If we denote data values as $d_i$ for $i = 0, 1, 2, \ldots, n$, then the RMS value is
$\sqrt{\frac{1}{n}\sum_{i=0}^{n} d_i^2}$, the mean value is $\frac{1}{n}\sum_{i=0}^{n} d_i$,
the $L_2$-norm value is $\sqrt{\sum_{i=0}^{n} d_i^2}$, the variance is
$\frac{1}{n-1}\left[\sum_{i=0}^{n} d_i^2 - \frac{1}{n}\left(\sum_{i=0}^{n} d_i\right)^2\right]$,
and the standard deviation is the square root of the

variance. Using sfattr is a quick way to see the distribution of data values and 

check it for anomalies.
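The statistics above are easy to reproduce outside Madagascar. The following Python sketch follows the formulas in the sfattr self-documentation; the function name and the dictionary keys are hypothetical, and the lval=0 branch (count of nonzero samples) mirrors the attr.c fragment shown in the implementation section.

```python
import math

def attributes(data, lval=2):
    """Compute sfattr-style statistics for a list of floats."""
    n = len(data)
    total = sum(data)
    sumsq = sum(d * d for d in data)
    mean = total / n
    rms = math.sqrt(sumsq / n)
    if lval == 0:
        norm = float(sum(1 for d in data if d != 0))
    else:
        norm = sum(abs(d) ** lval for d in data) ** (1.0 / lval)
    # Same guard as attr.c: variance is zero for a single sample.
    variance = abs(sumsq - n * mean * mean) / (n - 1) if n > 1 else 0.0
    return {
        "rms": rms,
        "mean": mean,
        "norm": norm,
        "variance": variance,
        "std": math.sqrt(variance),
        "max": max(data),
        "min": min(data),
        "nonzero": sum(1 for d in data if d != 0),
        "samples": n,
    }

stats = attributes([1.0, 2.0, 3.0, 4.0])
```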


Implementation: system/main/attr.c 

Computations start by finding the input data (in) size (nsiz) and dimensions (dim). 

system/main/attr.c 

    dim = (size_t) sf_largefiledims (in,n);
    for (nsiz=1, i=0; i < dim; i++) {
        nsiz *= n[i];
    }

In the main loop, we read the input data buffer by buffer. 


    for (nleft=nsiz; nleft > 0; nleft -= nbuf) {
        nbuf = (bufsiz < nleft)? bufsiz: nleft;
        switch (type) {
            case SF_FLOAT:
                sf_floatread((float*) buf,nbuf,in);
                break;
            case SF_INT:
                sf_intread((int*) buf,nbuf,in);
                break;
            case SF_SHORT:
                sf_shortread((short*) buf,nbuf,in);
                break;
            case SF_COMPLEX:
                sf_complexread((sf_complex*) buf,nbuf,in);
                break;
            case SF_UCHAR:
                sf_ucharread((unsigned char*) buf,nbuf,in);
                break;
            case SF_CHAR:
            default:
                sf_charread(buf,nbuf,in);
                break;
        }

The data attributes are accumulated in corresponding double-precision variables. 

Finally, the attributes are reduced and printed out.



    fsum += f;
    fsqr += (double) f*f;

    fmean = fsum/nsiz;

    if (lval==2) fnorm = sqrt(fsqr);
    else if (lval==0) fnorm = nsiz-nzero;
    else fnorm = pow(flval,1./lval);
    frms = sqrt(fsqr/nsiz);
    if (nsiz > 1) fvar = fabs(fsqr-nsiz*fmean*fmean)/(nsiz-1);
    else fvar = 0.0;
    fstd = sqrt(fvar);

    if (NULL==want || 0==strcmp(want,"rms"))
        printf(" rms = %13.6g \n",(float) frms);
    if (NULL==want || 0==strcmp(want,"mean"))
        printf(" mean = %13.6g \n",(float) fmean);
    if (NULL==want || 0==strcmp(want,"norm"))
        printf(" %d-norm = %13.6g \n",lval,(float) fnorm);
    if (NULL==want || 0==strcmp(want,"var"))
        printf(" variance = %13.6g \n",(float) fvar);
    if (NULL==want || 0==strcmp(want,"std"))
        printf(" std dev = %13.6g \n",(float) fstd);


sfcat: Concatenate datasets. 

sfcat > out.rsf order= space= axis=3 nspace=(int) (ni/(20*nin) + 1) o= d= 

[ one.rsf 

bash$ sfin one.rsf 

one.rsf: 

in="/tmp/one.rsf@" 

esize=4 type=float form=native 

n1=2 d1=0.004 o1=0 label1="Time" unit1="s" 

n2=3 d2=0.1 o2=0 label2="Distance" unit2="km" 

6 elements 24 bytes 

bash$ sfcat one.rsf one.rsf axis=1 > two.rsf 

bash$ sfin two.rsf 

two.rsf: 

in="/tmp/two.rsf@" 





Example of sfmerge: 

bash$ sfmerge one.rsf one.rsf axis=2 > two.rsf 


two.rsf: 

in="/tmp/two.rsf@"






In this case, an extra empty trace is inserted between the two merged files. 

The axes that are not being merged are checked for consistency: 

bash$ sfcat one.rsf two.rsf > three.rsf 

sfcat: n2 mismatch: need 3 

Implementation: system/main/cat.c 

The first input file is either in the list or in the standard input. 

system/main/cat.c 

    if (!sf_stdin()) { /* no input file in stdin */
        nin=0;
    } else {
        filename[0] = "in";
        nin=1;
    }

Everything on the command line that does not contain a “=” sign is treated as a 

file name, and the corresponding file object is added to the list. 


    for (i=1; i< argc; i++) { /* collect inputs */
        if (NULL != strchr(argv[i],'='))
            continue; /* not a file */
        filename[nin] = argv[i];
        nin++;
    }
    if (0==nin) sf_error ("no input");

As explained above, if the space= parameter is not set, it is inferred from the 

program name: sfmerge corresponds to space=y and sfcat corresponds to space=n. 

Find the axis for the merging (from the command line axis= argument) and figure
out two sizes: n1 for everything before the axis and n2 for everything after the axis.



    if (!sf_getbool("space",&space)) {
        /* Insert additional space.
           y is default for sfmerge, n is default for sfcat */
        prog = sf_getprog();
        if (NULL != strstr (prog, "merge")) {
            space = true;
        } else if (NULL != strstr (prog, "cat")) {
            space = false;
        } else {
            sf_warning("%s is neither merge nor cat,"
                       " assume merge",prog);
            space = true;
        }
    }

    n1=1;
    n2=1;
    for (i=1; i <= dim; i++) {
        if (i < axis) n1 *= n[i-1];
        else if (i > axis) n2 *= n[i-1];
    }


In the output, the selected axis will get extended. 


    /* figure out the length of extended axis */
    ni = 0;
    for (j=0; j < nin; j++) {
        ni += naxis[j];
    }

    if (space) {
        if (!sf_getint("nspace",&nspace))
            nspace = (int) (ni/(20*nin) + 1);
        /* if space=y, number of traces to insert */
        ni += nspace*(nin-1);
    }

    (void) snprintf(key,3,"n%d",axis);
    sf_putint(out,key,(int) ni);

The rest is simple: loop through the datasets reading and writing the data in 

buffer-size chunks and adding extra empty chunks if space=y. 
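The arithmetic for the length of the extended axis can be checked with a few lines of Python. This is a sketch of the calculation shown in the cat.c fragment, with a hypothetical function name; integer division is assumed to match the C expression on integer operands.

```python
def merged_axis_length(naxis, space, nspace=None):
    """Length of the output axis when concatenating or merging datasets.

    naxis: length of the selected axis in each input file.
    space: True for sfmerge-style gaps, False for plain sfcat.
    nspace: number of empty traces per gap (defaults as in cat.c).
    """
    nin = len(naxis)
    ni = sum(naxis)
    if space:
        if nspace is None:
            nspace = ni // (20 * nin) + 1
        ni += nspace * (nin - 1)
    return ni

# Two copies of a file with n2=3, as in the sfmerge example above.
cat_length = merged_axis_length([3, 3], space=False)
merge_length = merged_axis_length([3, 3], space=True)
```

For the two-file example, the default gap is one trace, which agrees with the statement that a single empty trace is inserted between the merged files.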

sfcmplx: Create a complex dataset from its real and imaginary 

parts. 

sfcmplx < real.rsf > cmplx.rsf real.rsf imag.rsf 

There has to be only two input files specified and no additional parameters. 

sfcmplx simply creates a complex dataset from its real and imaginary parts. The 

reverse operation can be accomplished with sfreal and sfimag. 
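One way to picture what sfcmplx does is as a byte-level interleaving of two float arrays, which is also how the cmplx.c fragment in the implementation section proceeds with memcpy. The Python sketch below is an illustration only; the function name and the default esize=4 (single-precision floats) are assumptions.

```python
import struct

def interleave(real_bytes, imag_bytes, esize=4):
    """Interleave two equally long byte buffers element by element."""
    if len(real_bytes) != len(imag_bytes):
        raise ValueError("real and imaginary parts differ in size")
    out = bytearray()
    for i in range(0, len(real_bytes), esize):
        out += real_bytes[i:i + esize]   # real part of one sample
        out += imag_bytes[i:i + esize]   # imaginary part of the same sample
    return bytes(out)

real = struct.pack("3f", 1.0, 2.0, 3.0)
imag = struct.pack("3f", 10.0, 20.0, 30.0)
pairs = struct.unpack("6f", interleave(real, imag))
```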

Example of sfcmplx: 

bash$ sfspike n1=2 n2=3 > one.rsf 


one.rsf: 





6 elements 24 bytes



system/main/cat.c

    for (i2=0; i2 < n2; i2++) {
        for (j=0; j < nin; j++) {
            k = order[j];
            for (ni = n1*naxis[k]*esize; ni > 0; ni -= nbuf) {
                nbuf = (BUFSIZ < ni)? BUFSIZ: ni;
                sf_charread (buf,nbuf,in[k]);
                sf_charwrite (buf,nbuf,out);
            }
            if (!space || j == nin-1) continue;
            /* Add spaces */
            memset(buf,0,BUFSIZ);
            for (ni = n1*nspace*esize; ni > 0; ni -= nbuf) {
                nbuf = (BUFSIZ < ni)? BUFSIZ: ni;
                sf_charwrite (buf,nbuf,out);
            }
        }
    }

bash$ sfcmplx one.rsf one.rsf > cmplx.rsf 

bash$ sfin cmplx.rsf 

cmplx.rsf: 

in="/tmp/cmplx.rsf@" 

esize=8 type=complex form=native 




Implementation: system/main/cmplx.c 

The program flow is simple. First, get the names of the input files. 

The main part of the program reads the real and imaginary parts buffer by buffer,
assembles them, and writes out the complex output.

sfconjgrad: Generic conjugate-gradient solver for linear inversion 

sfconjgrad < dat.rsf mod=mod.rsf > to.rsf < from.rsf > out.rsf niter=1 

file mod= auxiliary input file name 

int niter=1 number of iterations


system/main/cmplx.c 

    /* the first two non-parameters are real and imag files */
    for (i=1; i< argc; i++) {
        if (NULL == strchr(argv[i],'=')) {
            if (NULL == real) {
                real = sf_input (argv[i]);
            } else {
                imag = sf_input (argv[i]);
                break;
            }
        }
    }
    if (NULL == imag) {
        if (NULL == real) sf_error ("not enough input");
        /* if only one input, real is in stdin */
        imag = real;
        real = sf_input("in");
    }

system/main/cmplx.c 

    for (nleft= (size_t) (rsize*resize);
         nleft > 0; nleft -= nbuf) {
        nbuf = (BUFSIZ < nleft)? BUFSIZ: nleft;
        sf_charread(rbuf,nbuf,real);
        sf_charread(ibuf,nbuf,imag);
        for (i=0; i < nbuf; i += resize) {
            memcpy(cbuf+2*i, rbuf+i,(size_t) resize);
            memcpy(cbuf+2*i+resize,ibuf+i,(size_t) resize);
        }
        sf_charwrite(cbuf,2*nbuf,cmplx);
    }


sfconjgrad is a generic program for least-squares linear inversion with the
conjugate-gradient method. Suppose you have an executable program that takes an RSF
file from the standard input and produces an RSF file in the standard output. It may
take any number of additional parameters, but one of them must be adj=, which selects the
forward (adj=0) or adjoint (adj=1) operation. The program is typically an
RSF program, but it could be anything (a script, a multiprocessor MPI program, etc.)
as long as it implements a linear operator L and its adjoint. There are no restrictions
on the data size or shape. You can easily test the adjointness with sfdottest. The
sfconjgrad program searches for a vector m that minimizes the least-squares misfit
$\|d - L\,m\|^2$ for the given input data vector d.

Here is an example. The sfhelicon program implements Claerbout’s multidimensional 

helical filtering (Claerbout, 1998). It requires a filter to be specified in 

addition to the input and output vectors. We create a helical 2-D filter using the 

Unix echo command. 

bash$ echo 1 19 20 n1=3 n=20,20 data_format=ascii_int in=lag.rsf > lag.rsf 

bash$ echo 1 1 1 a0=-3 n1=3 data_format=ascii_float in=flt.rsf > flt.rsf 

Next, we create an example 2-D model and data vector with sfspike. 

bash$ sfspike n1=50 n2=50 > vec.rsf 

The sfdottest program can perform the dot product test to check that the adjoint 

mode works correctly. 

bash$ sfdottest sfhelicon filt=flt.rsf lag=lag.rsf \ 

mod=vec.rsf dat=vec.rsf 

sfdottest: L[m]*d=5.28394 

sfdottest: L’[d]*m=5.28394 

Your numbers may be different because sfdottest generates new random input on 

each run. Next, let us make some random data with sfnoise. 

bash$ sfnoise seed=2005 rep=y < vec.rsf > dat.rsf 

and try to invert the filtering operation using sfconjgrad: 

bash$ sfconjgrad sfhelicon filt=flt.rsf lag=lag.rsf \ 

mod=vec.rsf < dat.rsf > mod.rsf niter=10 

sfconjgrad: iter 1 of 10 

sfconjgrad: grad=3253.65 


sfconjgrad: grad=289.421


















The output shows that, in 10 iterations, the norm of the gradient vector decreases by 

almost 1000. We can check the residual misfit before 

bash$ < dat.rsf sfattr want=norm 

norm value = 49.7801 

and after 

bash$ sfhelicon filt=flt.rsf lag=lag.rsf < mod.rsf | \ 

sfadd scale=1,-1 dat.rsf | sfattr want=norm 

norm value = 5.73563 

In 10 iterations, the misfit decreased by an order of magnitude. The result can be 

improved by running the program for more iterations. 
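For readers who want to see the idea without the RSF machinery, here is a textbook conjugate-gradient least-squares (CGLS) sketch in Python. It only needs a forward operator and its adjoint, which is the same contract sfconjgrad asks of the external program. It is not claimed to be the algorithm in conjgrad.c; the function names and the causal-integration test operator are assumptions chosen for illustration.

```python
def causint(x, adj=False):
    """Causal integration (running sum) or its adjoint (reverse running sum)."""
    out = [0.0] * len(x)
    total = 0.0
    order = reversed(range(len(x))) if adj else range(len(x))
    for i in order:
        total += x[i]
        out[i] = total
    return out

def conjgrad(op, data, nmodel, niter):
    """Minimize |d - L m|^2 with CGLS, starting from m = 0."""
    model = [0.0] * nmodel
    resid = list(data)                      # r = d - L m
    grad = op(resid, adj=True)              # g = L^T r
    direction = list(grad)
    gnorm = sum(g * g for g in grad)
    for _ in range(niter):
        step = op(direction, adj=False)     # q = L p
        denom = sum(q * q for q in step)
        if gnorm == 0.0 or denom == 0.0:
            break
        alpha = gnorm / denom
        model = [m + alpha * p for m, p in zip(model, direction)]
        resid = [r - alpha * q for r, q in zip(resid, step)]
        grad = op(resid, adj=True)
        gnew = sum(g * g for g in grad)
        direction = [g + (gnew / gnorm) * p for g, p in zip(grad, direction)]
        gnorm = gnew
    return model

true_model = [1.0, -2.0, 0.5, 3.0]
data = causint(true_model)
estimate = conjgrad(causint, data, nmodel=4, niter=4)
```

In exact arithmetic, CGLS recovers a model with four unknowns in at most four iterations, which is why `niter=4` is enough for this small test.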

Implementation: system/main/conjgrad.c 

sfcp: Copy or move a dataset. 

sfcp < in.rsf > out.rsf in.rsf out.rsf 

sfcp - copy, sfmv - move. 

Mimics standard Unix commands. 

The sfcp and sfmv commands imitate the Unix cp and mv commands and serve
for copying and moving RSF files. Example:




one.rsf: 






bash$ sfcp one.rsf two.rsf 


two.rsf: 

in="/tmp/two.rsf@" 





Implementation: system/main/cp.c 

First, we look for the first two command-line arguments that do not contain the “=”
character and treat them as the names of the input and the output files.

system/main/cp.c 

    /* the first two non-parameters are in and out files */
    for (i=1; i< argc; i++) {
        if (NULL == strchr(argv[i],'=')) {
            if (NULL == in) {
                infile = argv[i];
                in = sf_input (infile);
            } else {
                out = sf_output (argv[i]);
                break;
            }
        }
    }

Next, we use library functions sf_cp and sf_rm to do the actual work.


system/main/cp.c

    sf_cp(in,out);
    if (NULL != strstr (prog,"mv"))
        sf_rm(infile,false,false,false);

sfcut: Zero a portion of the dataset. 

sfcut < in.rsf > out.rsf verb=n j#=(1,...) d#=(d1,d2,...) f#=(0,...) 

min#=(o1,o2,,...) n#=(0,...) max#=(o1+(n1-1)*d1,o2+(n1-1)*d2,,...) 

Reverse of window. 

float d#=(d1,d2,...) sampling in #-th dimension 

largeint f#=(0,...) window start in #-th dimension 

int j#=(1,...) jump in #-th dimension 

float max#=(o1+(n1-1)*d1,o2+(n1-1)*d2,,...) maximum in #-th dimension

float min#=(o1,o2,,...) minimum in #-th dimension 

int n#=(0,...) window size in #-th dimension 

bool verb=n [y/n] Verbosity flag 

The sfcut command is related to sfwindow and takes the same set of arguments,
but instead of extracting the selected window, it fills it with zeroes. The size of the
input data is preserved.

Examples: 

bash$ sfspike n1=5 n2=5 > in.rsf 

bash$ < in.rsf sfdisfil 

0: 1 1 1 1 1 

5: 1 1 1 1 1 

10: 1 1 1 1 1 

15: 1 1 1 1 1 

20: 1 1 1 1 1 

bash$ < in.rsf sfcut n1=2 f1=1 n2=3 f2=2 | sfdisfil 

0: 1 1 1 1 1 

5: 1 1 1 1 1 

10: 1 0 0 1 1 

15: 1 0 0 1 1 

20: 1 0 0 1 1 

bash$ < in.rsf sfcut j1=2 | sfdisfil 

0: 0 1 0 1 0 

5: 0 1 0 1 0


10: 0 1 0 1 0 

15: 0 1 0 1 0 

20: 0 1 0 1 0 
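The two examples above can be imitated in Python to make the roles of f#, n#, and j# explicit. The sketch treats the data as a list of traces (axis 2), each a list of samples (axis 1). The function name and the rule used for the default window size (as many strided samples as fit after the start) are assumptions, not a statement about the sfcut source.

```python
def cut(data, f1=0, n1=None, j1=1, f2=0, n2=None, j2=1):
    """Return a copy of data with a strided window set to zero."""
    len2, len1 = len(data), len(data[0])
    if n1 is None:
        n1 = 1 + (len1 - 1 - f1) // j1   # assumed default: rest of axis 1
    if n2 is None:
        n2 = 1 + (len2 - 1 - f2) // j2   # assumed default: rest of axis 2
    out = [trace[:] for trace in data]
    for k2 in range(n2):
        for k1 in range(n1):
            out[f2 + k2 * j2][f1 + k1 * j1] = 0
    return out

ones = [[1] * 5 for _ in range(5)]
window = cut(ones, n1=2, f1=1, n2=3, f2=2)   # like sfcut n1=2 f1=1 n2=3 f2=2
stride = cut(ones, j1=2)                     # like sfcut j1=2
```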

sfdd: Convert between different formats. 

sfdd < in.rsf > out.rsf trunc=n line=8 ibm=n form= type= format= 

string form= ascii, native, xdr 

string format= Element format (for conversion to ASCII) 

bool ibm=n [y/n] Special case - assume integers actually 

represent IBM floats 

int line=8 Number of numbers per line (for conversion 

to ASCII) 

bool trunc=n [y/n] Truncate or round to nearest when converting 

from float to int/short 

string type= int, float, complex, short 

The sfdd program is used to change either the form (ascii, xdr, native) or the 

type (complex, float, int, char) of the input dataset. 

In the example below, we create a plain text (ASCII) file with numbers and then 

use sfdd to generate an RSF file in xdr form with complex numbers. 

bash$ cat test.txt 

1 2 3 4 5 6 

bash$ echo n1=6 data_format=ascii_int in=test.txt > test.rsf 

bash$ sfin test.rsf 

test.rsf: 

in="test.txt" 

esize=0 type=int form=ascii 

n1=6 d1=? o1=? 

6 elements 

bash$ sfdd < test.rsf form=xdr type=complex > test2.rsf 

bash$ sfin test2.rsf 

test2.rsf: 

in="/tmp/test2.rsf@" 

esize=8 type=complex form=xdr 

n1=3 d1=? o1=? 


bash$ sfdisfil < test2.rsf 

0: 1, 2i 3, 4i 5, 6i 

To learn more about the RSF data format, consult the guide to RSF format.
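The change from n1=6 integers to n1=3 complex numbers in the example suggests that consecutive values are paired into real and imaginary parts. A small Python sketch of that pairing, with a hypothetical function name, is:

```python
def ints_to_complex(values):
    """Pair consecutive numbers into (real, imaginary) complex values."""
    if len(values) % 2:
        raise ValueError("need an even number of values")
    return [complex(values[k], values[k + 1])
            for k in range(0, len(values), 2)]

# The six numbers from test.txt become three complex samples.
samples = ints_to_complex([1, 2, 3, 4, 5, 6])
```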


sfdisfil: Print out data values. 

sfdisfil < in.rsf number=y col=0 format= header= trailer= 

Alternatively, use sfdd and convert to ASCII form. 

int col=0 Number of columns. The default depends 

on the data type: 10 for int and char, 5 

for float, 3 for complex 

string format= Format for numbers (printf-style). The 

default depends on the data type: ””” 

string header= Optional header string to output before 

data 

bool number=y [y/n] If y, number the elements

string trailer= Optional trailer string to output after 

data 

The sfdisfil program simply dumps the data contents to the standard output 

in a text form. It is used mostly for debugging purposes to quickly examine RSF files. 

Here is an example: 

bash$ sfmath o1=0 d1=2 n1=12 output=x1 > test.rsf 

bash$ < test.rsf sfdisfil 

0: 0 2 4 6 8 

5: 10 12 14 16 18 

10: 20 22 

The output format is easily configurable. 

bash$ < test.rsf sfdisfil col=6 number=n format="%5.1f" 

0.0 2.0 4.0 6.0 8.0 10.0 

12.0 14.0 16.0 18.0 20.0 22.0 

Along with sfdd, sfdisfil provides a simple way to convert RSF data to an ASCII 

form. 

sfdottest: Generic dot-product test for linear operators with 

adjoints 

sfdottest mod=mod.rsf dat=dat.rsf > pip.rsf 

file dat= auxiliary input file name 

file mod= auxiliary input file name 

sfdottest is a generic dot-product test program for testing linear operators. Suppose 

there is an executable program that takes an RSF file from the standard


input and produces an RSF file in the standard output. It may take any number of 

additional parameters but one of them must be adj= that sets the forward (adj=0) 

or adjoint (adj=1) operations. The program is typically an RSF program 

but it could be anything (a script, a multiprocessor MPI program, etc.) as long as 

it implements a linear operator L and its adjoint $L^T$. The sfdottest program
tests the equality

$$ d^T L\,m = m^T L^T d \qquad (1) $$

by using random vectors m and d. You can invoke it with 

bash$ sfdottest [optional arguments] mod=mod.rsf dat=dat.rsf

where mod.rsf and dat.rsf are RSF files that represent vectors from the model and 

data spaces. sfdottest does not create any temporary files and does not have any 

restrictive limitations on the size of the vectors. 

Here is an example. We first set up a vector with 100 elements using sfspike and
then run sfdottest to test the sfcausint program. sfcausint implements a linear
operator of causal integration and its adjoint, the anti-causal integration.

bash$ sfspike n1=100 > vec.rsf 

bash$ sfdottest sfcausint mod=vec.rsf dat=vec.rsf 



bash$ sfdottest sfcausint mod=vec.rsf dat=vec.rsf 



The numbers are different on subsequent runs because of changing seed in the random 

number generator. 
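The same test is easy to sketch in Python for an in-memory operator. The example below checks equation (1) for causal integration and its adjoint using random vectors. The function names and the seed argument are hypothetical, and the operator is a plain running sum rather than a call to sfcausint.

```python
import random

def causint(x, adj=False):
    """Causal integration (running sum) or its adjoint (reverse running sum)."""
    out = [0.0] * len(x)
    total = 0.0
    order = reversed(range(len(x))) if adj else range(len(x))
    for i in order:
        total += x[i]
        out[i] = total
    return out

def dot_test(op, nmodel, ndata, seed=None):
    """Return (d . L m, m . L^T d) for random vectors m and d."""
    rng = random.Random(seed)
    m = [rng.uniform(-1.0, 1.0) for _ in range(nmodel)]
    d = [rng.uniform(-1.0, 1.0) for _ in range(ndata)]
    lhs = sum(a * b for a, b in zip(op(m, adj=False), d))
    rhs = sum(a * b for a, b in zip(op(d, adj=True), m))
    return lhs, rhs

lhs, rhs = dot_test(causint, nmodel=100, ndata=100, seed=2005)
```

The two numbers should agree to within floating-point round-off; a large discrepancy indicates that the adjoint is not implemented consistently with the forward operator.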

Here is a somewhat more complicated example. The sfhelicon program implements 

Claerbout’s multidimensional helical filtering (Claerbout, 1998). It requires a 

filter to be specified in addition to the input and output vectors. We create a helical 

2-D filter using the Unix echo command. 

bash$ echo 1 19 20 n1=3 n=20,20 data_format=ascii_int in=lag.rsf > lag.rsf 

bash$ echo 1 1 1 a0=-3 n1=3 data_format=ascii_float in=flt.rsf > flt.rsf 

Next, we create an example 2-D model and data vector with sfspike. 

bash$ sfspike n1=50 n2=50 > vec.rsf 

Now the sfdottest program can perform the dot product test.



> mod=vec.rsf dat=vec.rsf 



Here is the same program tested in the inverse filtering mode: 


> mod=vec.rsf dat=vec.rsf div=y 



sfget: Output parameters from the header. 

sfget parform=y all=n par1 par2 ... 

bool all=n [y/n] If output all values. 

bool parform=y [y/n] If y, print out parameter=value. If n, 

print out value. 

The sfget program extracts a parameter value from an RSF file. It is useful 

mostly for scripting. Here is, for example, a quick calculation of the maximum value 

on the first axis in an RSF dataset (the output of sfspike) using the standard Unix 

bc calculator. 

bash$ ( sfspike n1=100 | sfget n1 d1 o1; echo "o1+(n1-1)*d1" ) | bc 

.396 
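
The same calculation can be sketched in Python. This is only an illustration of the arithmetic: parse_header is a hypothetical helper, the header string is typed in by hand using the sfspike defaults, and the simple whitespace split would not handle quoted values that contain spaces.

```python
def parse_header(text):
    """Collect key=value pairs; a later assignment overrides an earlier one."""
    params = {}
    for token in text.split():
        if "=" in token:
            key, value = token.split("=", 1)
            params[key] = value.strip('"')
    return params


# Hand-written stand-in for the header produced by "sfspike n1=100".
header = 'n1=100 d1=0.004 o1=0 label1="Time" unit1="s"'
par = parse_header(header)
n1, d1, o1 = int(par["n1"]), float(par["d1"]), float(par["o1"])

# Maximum coordinate on the first axis: o1 + (n1 - 1) * d1
axis_max = o1 + (n1 - 1) * d1
print(round(axis_max, 6))
```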

See also sfput. 

Implementation: system/main/get.c 

The implementation is trivial. Loop through all command-line parameters that contain 

the “=” character. 

system/main/get.c 

    if (!sf_getbool("all",&all)) all=false;

    /* If output all values. */

Get the parameter value (as string) and output it as either key=value or value, 

depending on the parform parameter.


system/main/get.c 

    table = sf_simtab_init(tabsize);

    sf_simtab_input(table, stdin, NULL);

    if (all) {

        sf_simtab_output(table, stdout);

    } else {

        for (i = 1; i < argc; i++) {

sfheadercut: Zero a portion of a dataset based on a header 

mask. 

sfheadercut mask=head.rsf < in.rsf > out.rsf 

The input data is a collection of traces n1xn2, 

mask is an integer array of size n2. 

file mask= auxiliary input file name 

sfheadercut is close to sfheaderwindow but instead of windowing the dataset, 

it fills the traces specified by the header mask with zeroes. The size of the input data 

is preserved. 

Here is an example of using sfheadercut for zeroing every other trace in the

input file. First, let us create an input file with ten traces: 

bash$ sfmath n1=5 n2=10 output=x2+1 > input.rsf 

bash$ < input.rsf sfdisfil 

0: 1 1 1 1 1 

5: 2 2 2 2 2 

10: 3 3 3 3 3 

15: 4 4 4 4 4 

20: 5 5 5 5 5 

25: 6 6 6 6 6 

30: 7 7 7 7 7 

35: 8 8 8 8 8 

40: 9 9 9 9 9 

45: 10 10 10 10 10 

Next, we can create a mask with alternating ones and zeros using sfinterleave. 

bash$ sfspike n1=5 mag=1 | sfdd type=int > ones.rsf


bash$ sfspike n1=5 mag=0 | sfdd type=int > zeros.rsf 

bash$ sfinterleave axis=1 ones.rsf zeros.rsf > mask.rsf 

bash$ sfdisfil < mask.rsf 

0: 1 0 1 0 1 0 1 0 1 0 

Finally, sfheadercut zeros the input traces. 

bash$ sfheadercut < input.rsf mask=mask.rsf > output.rsf 

bash$ sfdisfil < output.rsf 

0: 1 1 1 1 1 

5: 0 0 0 0 0 

10: 3 3 3 3 3 

15: 0 0 0 0 0 

20: 5 5 5 5 5 

25: 0 0 0 0 0 

30: 7 7 7 7 7 

35: 0 0 0 0 0 

40: 9 9 9 9 9 

45: 0 0 0 0 0 
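
The effect of the mask in this example can be sketched in a few lines of Python. The function header_cut is a hypothetical stand-in, not the program's source: following the output above, traces whose mask value is zero are filled with zeros and the others pass through unchanged.

```python
def header_cut(traces, mask):
    """Zero the traces whose mask entry is 0; keep the rest unchanged."""
    return [list(tr) if keep else [0] * len(tr) for tr, keep in zip(traces, mask)]


# Ten traces of five samples each, with trace number i2+1 as the value.
traces = [[i2 + 1] * 5 for i2 in range(10)]
mask = [1, 0] * 5

cut = header_cut(traces, mask)
for tr in cut:
    print(*tr)
```

The number of traces is preserved, which is the difference from windowing.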

sfheadersort: Sort a dataset according to a header key. 

sfheadersort < in.rsf > out.rsf head= 

string head= header file 

sfheadersort is used to sort traces in the input file according to trace header 

information. 

Here is an example of using sfheadersort for randomly shuffling traces in the 

input file. First, let us create an input file with seven traces:

bash$ sfmath n1=5 n2=7 output=x2+1 > input.rsf

bash$ < input.rsf sfdisfil



0: 1 1 1 1 1 

5: 2 2 2 2 2 

10: 3 3 3 3 3 

15: 4 4 4 4 4 

20: 5 5 5 5 5 

25: 6 6 6 6 6 

30: 7 7 7 7 7 

Next, we can create a random file with seven header values using sfnoise.


bash$ sfspike n1=7 | sfnoise rep=y type=n > random.rsf 

bash$ < random.rsf sfdisfil 

0: 0.05256 -0.2879 0.1487 0.4097 0.1548 

5: 0.4501 0.2836 

If you reproduce this example, your numbers will most likely be different, because, 

in the absence of seed= parameter, sfnoise uses a random seed value to generate 

pseudo-random numbers. Finally, we apply sfheadersort to shuffle the input traces. 

bash$ < input.rsf sfheadersort head=random.rsf > output.rsf 

bash$ < output.rsf sfdisfil 

0: 2 2 2 2 2 

5: 1 1 1 1 1 

10: 3 3 3 3 3 

15: 5 5 5 5 5 

20: 7 7 7 7 7 

25: 4 4 4 4 4 

30: 6 6 6 6 6 

As expected, the order of traces in the output file corresponds to the order of values 

in the header. Thanks to the separation between headers and data, the operation of 

sfheadersort is optimally efficient. It first sorts the headers and only then accesses 

the data, reading each data trace only once. 
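
The idea of sorting the headers first and touching the data only afterwards can be sketched as follows. This is an assumption-labeled Python illustration (header_sort is a hypothetical name), using the header values printed above.

```python
def header_sort(traces, head):
    """Reorder traces so that their header values are ascending."""
    # Sort the (small) header first; each trace is then visited once.
    order = sorted(range(len(head)), key=lambda i: head[i])
    return [traces[i] for i in order]


head = [0.05256, -0.2879, 0.1487, 0.4097, 0.1548, 0.4501, 0.2836]
traces = [[i + 1] * 5 for i in range(7)]

shuffled = header_sort(traces, head)
print([tr[0] for tr in shuffled])
```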

sfheaderwindow: Window a dataset based on a header mask. 

sfheaderwindow mask=head.rsf < in.rsf > out.rsf 

The input data is a collection of traces n1xn2, 

mask is an integer array of size n2, windowed is n1xm2,

where m2 is the number of nonzero elements in mask. 

file mask= auxiliary input file name 

sfheaderwindow is used to window traces in the input file according to trace 

header information. 

Here is an example of using sfheaderwindow for randomly selecting part of the 

traces in the input file. First, let us create an input file with ten traces:

bash$ sfmath n1=5 n2=10 output=x2+1 > input.rsf

bash$ < input.rsf sfdisfil



0: 1 1 1 1 1 

5: 2 2 2 2 2 

10: 3 3 3 3 3


15: 4 4 4 4 4 

20: 5 5 5 5 5 

25: 6 6 6 6 6 

30: 7 7 7 7 7 

35: 8 8 8 8 8 

40: 9 9 9 9 9 

45: 10 10 10 10 10 

Next, we can create a random file with ten header values using sfnoise. 

bash$ sfspike n1=10 | sfnoise rep=y type=n > random.rsf 

bash$ < random.rsf sfdisfil 

0: -0.005768 0.02258 -0.04331 -0.4129 -0.3909 

5: -0.03582 0.4595 -0.3326 0.498 -0.3517 

If you reproduce this example, your numbers will most likely be different, because, 

in the absence of seed= parameter, sfnoise uses a random seed value to generate 

pseudo-random numbers. Finally, we apply sfheaderwindow to window the input 

traces selecting only those for which the header is greater than zero. 

bash$ < random.rsf sfmask min=0 > mask.rsf 

bash$ < mask.rsf sfdisfil 

0: 0 1 0 0 0 0 1 0 1 0 

bash$ < input.rsf sfheaderwindow mask=mask.rsf > output.rsf 

bash$ < output.rsf sfdisfil 

0: 2 2 2 2 2 

5: 7 7 7 7 7 

10: 9 9 9 9 9 

In this case, only three traces are selected for the output. Thanks to the separation 

between headers and data, the operation of sfheaderwindow is optimally efficient. 
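
A minimal Python sketch of the windowing step, assuming the mask has already been computed (header_window is a hypothetical name, and the mask is copied from the output above):

```python
def header_window(traces, mask):
    """Keep only the traces whose mask entry is nonzero."""
    return [list(tr) for tr, keep in zip(traces, mask) if keep]


traces = [[i + 1] * 5 for i in range(10)]
mask = [0, 1, 0, 0, 0, 0, 1, 0, 1, 0]

selected = header_window(traces, mask)
for tr in selected:
    print(*tr)
```

The output has as many traces as there are nonzero mask entries, three in this case.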

sfin: Display basic information about RSF files. 

sfin info=y check=2. trail=y [


sfin is one of the most useful programs for operating with RSF files. It produces 

quick information on the file hypercube dimensions and checks the consistency of the 

associated data file. 

Here is an example. Let us create an RSF file and examine it with sfin. 

bash$ sfspike n1=100 n2=20 > spike.rsf 

bash$ sfin spike.rsf 

spike.rsf: 

in="/tmp/spike.rsf@" 





sfin reports the following information: 

• location of the data file (/tmp/spike.rsf) 

• element size (4 bytes) 

• element type (floating point) 

• element form (native) 

• hypercube dimensions (100 by 20) 

• axes scale (0.004 and 0.1) 

• axes origin (0 and 0) 

• axes labels 

• axes units 

• total number of elements 

• total number of bytes in the data file 

Suppose that the file got corrupted by a buggy program and reports incorrect 

dimensions. The sfin program should be able to catch the discrepancy. 

bash$ echo n2=100 >> spike.rsf 

bash$ sfin spike.rsf > /dev/null 

sfin: 

Actually 8000 bytes, 20% of expected. 
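
The size check amounts to comparing the byte count implied by the header with the size of the data file. A small Python sketch of that arithmetic follows; expected_bytes is a hypothetical helper and the parameter dictionary is written by hand to mirror the corrupted header.

```python
def expected_bytes(params, max_dim=9):
    """Element size times the product of all n# values found in the header."""
    size = int(params.get("esize", 4))
    for axis in range(1, max_dim + 1):
        size *= int(params.get("n%d" % axis, 1))
    return size


# Header after "echo n2=100 >> spike.rsf": the later n2 overrides n2=20.
params = {"esize": 4, "n1": 100, "n2": 100}
actual = 4 * 100 * 20  # bytes really written by sfspike n1=100 n2=20

expected = expected_bytes(params)
percent = 100.0 * actual / expected
print("Actually %d bytes, %g%% of expected." % (actual, percent))
```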

sfin also checks the first records in the file for zeros.


bash$ sfspike n1=100 n2=100 k2=99 > spike2.rsf 

bash$ sfin spike2.rsf >/dev/null 

sfin: The first 32768 bytes are all zeros 

The number of bytes to check is adjustable:

bash$ sfin spike2.rsf check=0.01 >/dev/null 

sfin: The first 16384 bytes are all zeros 

You can also output only the location of the data file. This is sometimes handy 

in scripts. 

bash$ sfin spike.rsf spike2.rsf info=n 

/tmp/spike.rsf@ /tmp/spike2.rsf@ 

An alternative is to use sfget, as follows: 

bash$ sfget parform=n in < spike.rsf 

/tmp/spike.rsf@ 

sfinterleave: Combine several datasets by interleaving. 

sfinterleave > out.rsf axis=3 [< file0.rsf] file1.rsf file2.rsf ... 

int axis=3 Axis for interleaving 

sfinterleave combines two or more datasets by interleaving them on one of the 

axes. Here is a quick example:

bash$ sfspike n1=5 n2=5 > one.rsf


bash$ sfdisfil < one.rsf 

0: 1 1 1 1 1 

5: 1 1 1 1 1 

10: 1 1 1 1 1 

15: 1 1 1 1 1 

20: 1 1 1 1 1 

bash$ sfscale < one.rsf dscale=2 > two.rsf 

bash$ sfdisfil < two.rsf 

0: 2 2 2 2 2 

5: 2 2 2 2 2 

10: 2 2 2 2 2 

15: 2 2 2 2 2 

20: 2 2 2 2 2


bash$ sfinterleave one.rsf two.rsf axis=1 | sfdisfil 

0: 1 2 1 2 1 

5: 2 1 2 1 2 

10: 1 2 1 2 1 

15: 2 1 2 1 2 

20: 1 2 1 2 1 

25: 2 1 2 1 2 

30: 1 2 1 2 1 

35: 2 1 2 1 2 

40: 1 2 1 2 1 

45: 2 1 2 1 2 

bash$ sfinterleave < one.rsf two.rsf axis=2 | sfdisfil 

0: 1 1 1 1 1 

5: 2 2 2 2 2 

10: 1 1 1 1 1 

15: 2 2 2 2 2 

20: 1 1 1 1 1 

25: 2 2 2 2 2 

30: 1 1 1 1 1 

35: 2 2 2 2 2 

40: 1 1 1 1 1 

45: 2 2 2 2 2 
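
The two cases above can be sketched in Python with nested lists indexed as data[i2][i1]. The function names are hypothetical and the sketch handles only two inputs of equal size.

```python
def interleave_axis1(a, b):
    """Alternate samples of a and b within each trace (first axis)."""
    return [[v for pair in zip(ta, tb) for v in pair] for ta, tb in zip(a, b)]


def interleave_axis2(a, b):
    """Alternate whole traces of a and b (second axis)."""
    out = []
    for ta, tb in zip(a, b):
        out.append(list(ta))
        out.append(list(tb))
    return out


one = [[1] * 5 for _ in range(5)]
two = [[2] * 5 for _ in range(5)]

by_samples = interleave_axis1(one, two)  # 5 traces of 10 samples
by_traces = interleave_axis2(one, two)   # 10 traces of 5 samples
print(by_samples[0])
print(by_traces[:2])
```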

sfmask: Create a mask. 

sfmask < in.rsf > out.rsf min= max= min= max= 

Mask is an integer data with ones and zeros. 

Ones correspond to input values between min and max. 

The output can be used with sfheaderwindow. 

int max= maximum header value 

int min= minimum header value 

sfmask creates an integer output of ones and zeros comparing the values of the 

input data to specified min= and max= parameters. It is useful for sfheaderwindow 

and in many other applications. Here is a quick example: 

bash$ sfmath n1=10 output="sin(x1)" > sin.rsf 

bash$ < sin.rsf sfdisfil 

0: 0 0.8415 0.9093 0.1411 -0.7568 

5: -0.9589 -0.2794 0.657 0.9894 0.4121 

bash$ < sin.rsf sfmask min=-0.5 max=0.5 | sfdisfil 

0: 1 0 0 1 0 0 1 0 0 1
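
The comparison can be reproduced with a short Python sketch. The name make_mask is hypothetical, and treating both bounds as inclusive is an assumption; no value in this example falls exactly on a bound.

```python
import math


def make_mask(values, lo, hi):
    """Return 1 where lo <= value <= hi and 0 elsewhere."""
    return [1 if lo <= v <= hi else 0 for v in values]


values = [math.sin(x1) for x1 in range(10)]
mask = make_mask(values, -0.5, 0.5)
print(*mask)
```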


sfmath: Mathematical operations on data files. 

sfmath > out.rsf n#= d#=(1,1,...) o#=(0,0,...) label#= unit#= type= label= unit= 

output= 

Known functions: 

cos, sin, tan, acos, asin, atan, 

cosh, sinh, tanh, acosh, asinh, atanh, 

exp, log, sqrt, abs, 

erf, erfc (for float data), 

arg, conj, real, imag (for complex data). 

sfmath will work on float or complex data, but all the input and output 

files must be of the same data type. 

An alternative to sfmath is sfadd, which may be more efficient, but is 

less versatile. 

Examples: 

sfmath x=file1.rsf y=file2.rsf power=file3.rsf output='sin((x+2*y)^power)' > out.rsf

sfmath < file1.rsf tau=file2.rsf output='exp(tau*input)' > out.rsf

sfmath n1=100 type=complex output="exp(I*x1)" > out.rsf 

Arguments which are not treated as variables in mathematical expressions: 

datapath=, type=, out= 

See also: sfheadermath. 

float d#=(1,1,...) sampling on #-th axis 

string label= data label 

string label#= label on #-th axis 

largeint n#= size of #-th axis 

float o#=(0,0,...) origin on #-th axis 

string output= Mathematical description of the output 

string type= output data type [float,complex] 

string unit= data unit 

string unit#= unit on #-th axis 

sfmath is a versatile program for mathematical operations with RSF files. It can 

operate with several input files, all of the same dimensions and data type. The data

type can be real (floating point) or complex. Here is an example that demonstrates 

several features of sfmath. 

bash$ sfmath n1=629 d1=0.01 o1=0 n2=40 d2=1 o2=5 \ 

output="x2*(8+sin(6*x1+x2/10))" > rad.rsf 

bash$ < rad.rsf sfrtoc | sfmath output="input*exp(I*x1)" > rose.rsf 

bash$ < rose.rsf sfgraph title=Rose screenratio=1 wantaxis=n | sfpen 

The first line creates a 2-D dataset that consists of 40 traces of 629 samples each. The


values of the data are computed with the formula "x2*(8+sin(6*x1+x2/10))", where 

x1 refers to the coordinate on the first axis, and x2 is the coordinate of the second 

axis. In the second line, we convert the data from real to complex using sfrtoc and 

produce a complex dataset using formula "input*exp(I*x1)", where input refers to 

the input file. Finally, we plot the complex data as a collection of parametric curves 

using sfgraph and display the result using sfpen. The plot appearing on your screen 

should look similar to Figure 1. 

Figure 1: This figure was created with sfmath.

One possible alternative to the second line above is 

bash$ < rad.rsf sfmath output=x1 > ang.rsf 

bash$ sfmath r=rad.rsf a=ang.rsf output="r*cos(a)" > cos.rsf 

bash$ sfmath r=rad.rsf a=ang.rsf output="r*sin(a)" > sin.rsf 

bash$ sfcmplx cos.rsf sin.rsf > rose.rsf 

Here we refer to input files by names (r and a) and combine the names in a formula.
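
The two routes give the same complex values, which can be checked numerically. The Python sketch below evaluates the formulas at individual coordinates; the function names are hypothetical and no RSF files are involved.

```python
import cmath
import math


def rad(x1, x2):
    """Radius from the first sfmath command: x2*(8+sin(6*x1+x2/10))."""
    return x2 * (8 + math.sin(6 * x1 + x2 / 10))


def rose_exp(x1, x2):
    """Complex value from input*exp(I*x1)."""
    return rad(x1, x2) * cmath.exp(1j * x1)


def rose_cos_sin(x1, x2):
    """Same value assembled from r*cos(a) and r*sin(a), with a = x1."""
    r = rad(x1, x2)
    return complex(r * math.cos(x1), r * math.sin(x1))


n1, d1, o1 = 629, 0.01, 0.0
o2 = 5.0
curve = [rose_exp(o1 + i1 * d1, o2) for i1 in range(n1)]  # first trace
print(len(curve), curve[100])
```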


sfpad: Pad a dataset with zeros. 

sfpad < in.rsf > out.rsf beg#=0 end#=0 

n#out is equivalent to n#; both of them override end#.

int beg#=0 the number of zeros to add before the beginning 

of #-th axis 

int end#=0 the number of zeros to add after the end 

of #-th axis 

sfpad increases the dimensions of the input dataset by padding the data with zeroes.

Here are some simple examples.

bash$ sfspike n1=5 n2=3 > one.rsf


bash$ sfdisfil < one.rsf 

0: 1 1 1 1 1 

5: 1 1 1 1 1 

10: 1 1 1 1 1 

bash$ < one.rsf sfpad n2=5 | sfdisfil 

0: 1 1 1 1 1 

5: 1 1 1 1 1 

10: 1 1 1 1 1 

15: 0 0 0 0 0 

20: 0 0 0 0 0 

bash$ < one.rsf sfpad beg2=2 | sfdisfil 

0: 0 0 0 0 0 

5: 0 0 0 0 0 

10: 1 1 1 1 1 

15: 1 1 1 1 1 

20: 1 1 1 1 1 

bash$ < one.rsf sfpad beg2=1 end2=1 | sfdisfil 

0: 0 0 0 0 0 

5: 1 1 1 1 1 

10: 1 1 1 1 1 

15: 1 1 1 1 1 

20: 0 0 0 0 0 

bash$ < one.rsf sfwindow n1=3 | sfpad n1=5 n2=5 beg1=1 beg2=1 | sfdisfil 

0: 0 0 0 0 0 

5: 0 1 1 1 0 

10: 0 1 1 1 0 

15: 0 1 1 1 0 

20: 0 0 0 0 0 

You can use sfcat to pad data with values other than zeroes.
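
The relation between beg#, end# and n# can be sketched in Python. The function pad2d is a hypothetical illustration for 2-D data stored as data[i2][i1]; when a target size n# is given, the sketch derives end# as n# minus beg# minus the input size.

```python
def pad2d(data, beg1=0, end1=0, beg2=0, end2=0):
    """Pad a list of traces with zeros before and after each axis."""
    n1 = len(data[0]) + beg1 + end1
    out = [[0] * n1 for _ in range(beg2)]
    for tr in data:
        out.append([0] * beg1 + list(tr) + [0] * end1)
    out += [[0] * n1 for _ in range(end2)]
    return out


one = [[1] * 5 for _ in range(3)]

# Equivalent of beg2=1 end2=1
padded = pad2d(one, beg2=1, end2=1)

# Equivalent of windowing to n1=3, then n1=5 n2=5 beg1=1 beg2=1
small = [tr[:3] for tr in one]
boxed = pad2d(small, beg1=1, end1=5 - 1 - 3, beg2=1, end2=5 - 1 - 3)
for tr in boxed:
    print(*tr)
```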


sfput: Input parameters into a header. 

sfput < in.rsf > out.rsf 

sfput is a very simple program. It simply appends parameters from the command 

line to the output RSF file. One can achieve similar results with editing by hand or 

with standard Unix utilities like sed and echo. sfput is sometimes more convenient 

because it handles input/output operations similarly to other regular RSF programs. 

bash$ sfspike n1=10 > spike.rsf

bash$ sfin spike.rsf


spike.rsf: 





bash$ sfput < spike.rsf d1=25 label1=Depth unit1=m > spike2.rsf 

bash$ sfin spike2.rsf 

spike2.rsf: 

in="/tmp/spike2.rsf@" 


n1=10 d1=25 o1=0 label1="Depth" unit1="m" 


sfreal: Extract real (sfreal) or imaginary (sfimag) part of a 

complex dataset. 

sfreal < cmplx.rsf > real.rsf 

sfreal extracts the real part of a complex type dataset. The imaginary part can 

be extracted with sfimag, and the real and imaginary parts can be combined together

with sfcmplx. 

Here is a simple example. Let us first create a complex dataset with sfmath 

bash$ sfmath n1=10 type=complex output="(2+I)*x1" > cmplx.rsf 

bash$ sfdisfil < cmplx.rsf

0: 0, 0i 2, 1i 4, 2i 

3: 6, 3i 8, 4i 10, 5i 

6: 12, 6i 14, 7i 16, 8i 

9: 18, 9i 

Extracting the real part with sfreal:


bash$ sfreal < cmplx.rsf | sfdisfil 

0: 0 2 4 6 8 

5: 10 12 14 16 18 

Extracting the imaginary part with sfimag: 

bash$ sfimag < cmplx.rsf | sfdisfil 

0: 0 1 2 3 4 

5: 5 6 7 8 9 

sfreverse: Reverse one or more axes in the data hypercube. 

sfreverse < in.rsf > out.rsf which=-1 verb=n memsize=sf_memsize() opt=

int memsize=sf_memsize() Max amount of RAM (in Mb) to be used

string opt= If y, change o and d parameters on the 

reversed axis; if i, don’t change o and d 


int which=-1 Which axis to reverse. To reverse a given 

axis, start with 0, add 1 to number to reverse 

n1 dimension, add 2 to number to 

reverse n2 dimension, add 4 to number to 

reverse n3 dimension, etc. Thus, which=7 

would reverse the first three dimensions, 

which=5 just n1 and n3, etc. which=0 

will just pass the input on through unchanged. 

Here is an example of using sfreverse. First, let us create a 2-D dataset. 

bash$ sfmath n1=5 d1=1 n2=3 d2=1 output=x1+x2 > test.rsf

bash$ < test.rsf sfdisfil


0: 0 1 2 3 4 

5: 1 2 3 4 5 

10: 2 3 4 5 6 

Reversing the first axis: 

bash$ < test.rsf sfreverse which=1 | sfdisfil 

0: 4 3 2 1 0 

5: 5 4 3 2 1 

10: 6 5 4 3 2 

Reversing the second axis:

bash$ < test.rsf sfreverse which=2 | sfdisfil



0: 2 3 4 5 6 

5: 1 2 3 4 5 

10: 0 1 2 3 4 

Reversing both the first and the second axis:

bash$ < test.rsf sfreverse which=3 | sfdisfil

0: 6 5 4 3 2

5: 5 4 3 2 1

10: 4 3 2 1 0

As you can see, the which= parameter controls the axes that are being reversed by 

encoding them into one number. 

When an axis is reversed, what happens with its axis origin and sampling parameters? 

This behavior is controlled by opt=. In our example, 

bash$ < test.rsf sfget n1 o1 d1 

n1=5 

o1=0 

d1=1 

bash$ < test.rsf sfreverse which=1 | sfget o1 d1 

o1=4 

d1=-1 

The default behavior (equivalent to opt=y) puts the origin o1 at the end of the axis 

and reverses the sampling parameter d1. Using opt=n preserves the sampling but 

reverses the origin. 

bash$ < test.rsf sfreverse which=1 opt=n | sfget o1 d1 

o1=-4 

d1=1 

Using opt=i preserves both the sampling and the origin while reversing the axis. 

bash$ < test.rsf sfreverse which=1 opt=i | sfget o1 d1 

o1=0 

d1=1 

One of the three possible behaviors may be desirable depending on the application.
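
The bit encoding of which= and the three opt= behaviors can be summarized in a short Python sketch. The function names are hypothetical, and the formulas in reversed_axis are inferred from the example outputs above rather than taken from the program source.

```python
def reverse2d(data, which):
    """Reverse axes of data[i2][i1]; bit 1 selects axis 1, bit 2 selects axis 2."""
    out = [list(tr) for tr in data]
    if which & 1:
        out = [tr[::-1] for tr in out]
    if which & 2:
        out = out[::-1]
    return out


def reversed_axis(n, o, d, opt="y"):
    """Return (origin, sampling) of a reversed axis for opt = y, n, or i."""
    if opt == "y":   # origin moves to the end of the axis, sampling flips sign
        return o + (n - 1) * d, -d
    if opt == "n":   # sampling is preserved, origin is negated
        return -(o + (n - 1) * d), d
    if opt == "i":   # neither origin nor sampling changes
        return o, d
    raise ValueError("opt must be 'y', 'n' or 'i'")


test = [[x1 + x2 for x1 in range(5)] for x2 in range(3)]
for tr in reverse2d(test, 3):
    print(*tr)
print(reversed_axis(5, 0.0, 1.0, "y"))
```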


sfrm: Remove RSF files together with their data. 

sfrm file1.rsf [file2.rsf ...] [-i] [-v] [-f] 

Mimics the standard Unix rm command. 

See also: sfmv, sfcp. 

sfrm is a program for removing RSF files. Its arguments mimic the arguments of 

the standard Unix rm utility: -v for verbosity, -i for interactive inquiry, -f for force 

removal of suspicious files. Unlike the Unix rm, sfrm removes both the RSF header 

files and the binary files that the headers point to. 

Example: 

bash$ sfspike n1=10 > spike.rsf datapath=./ 

bash$ sfget in < spike.rsf 

in=./spike.rsf@ 

bash$ ls spike* 

spike.rsf spike.rsf@ 

bash$ sfrm -v spike.rsf 

sfrm: sf_rm: Removing header spike.rsf 

sfrm: sf_rm: Removing data ./spike.rsf@ 

bash$ ls spike* 

ls: No match. 

sfrotate: Rotate a portion of one or more axes in the data 

hypercube. 

sfrotate < in.rsf > out.rsf verb=n memsize=sf_memsize() rot#=(0,0,...)


int rot#=(0,0,...) length of #-th axis that is moved to the 

end 


sfrotate modifies the input dataset by splitting it into parts and putting the 

parts back in a different order. Here is a quick example. 

bash$ sfmath n1=5 d1=1 n2=3 d2=1 output=x1+x2 > test.rsf

bash$ < test.rsf sfdisfil


0: 0 1 2 3 4 

5: 1 2 3 4 5 

10: 2 3 4 5 6


Rotating the first axis by putting the last two columns in front: 

bash$ < test.rsf sfrotate rot1=2 | sfdisfil 

0: 3 4 0 1 2 

5: 4 5 1 2 3 

10: 5 6 2 3 4 

Rotating the second axis by putting the last row in front: 

bash$ < test.rsf sfrotate rot2=1 | sfdisfil 

0: 2 3 4 5 6 

5: 0 1 2 3 4 

10: 1 2 3 4 5 

Rotating both the first and the second axis: 

bash$ < test.rsf sfrotate rot1=3 rot2=1 | sfdisfil 

0: 4 5 6 2 3 

5: 2 3 4 0 1 

10: 3 4 5 1 2 

The transformation is shown schematically in Figure 2. 

Figure 2: Schematic transformation of data with sfrotate (before and after).
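
The three rotations shown above can be reproduced with a small Python sketch. The functions rotate and rotate2d are hypothetical names; the sketch simply moves the last rot samples of an axis to the front, which is what the example outputs show.

```python
def rotate(seq, rot):
    """Move the last rot elements of seq to the front."""
    rot %= len(seq)
    cut = len(seq) - rot
    return seq[cut:] + seq[:cut]


def rotate2d(data, rot1=0, rot2=0):
    """Rotate data[i2][i1] along the first and second axes."""
    rows = [rotate(list(tr), rot1) for tr in data]
    return rotate(rows, rot2)


test = [[x1 + x2 for x1 in range(5)] for x2 in range(3)]
for tr in rotate2d(test, rot1=3, rot2=1):
    print(*tr)
```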

sfrtoc: Convert real data to complex (by adding zero imaginary 

part). 

sfrtoc < real.rsf > cmplx.rsf 

See also: sfcmplx 

The input to sfrtoc can be any type=float dataset:


bash$ sfspike n1=10 n2=20 n3=30 >real.rsf 

bash$ sfin real.rsf 

real.rsf: 

in="/var/tmp/real.rsf@" 






The output dataset will have type=complex, and its binary will be twice the size of 

the input: 

bash$ < real.rsf sfrtoc > complex.rsf

bash$ sfin complex.rsf 

complex.rsf: 

in="/var/tmp/complex.rsf@" 

esize=8 type=complex form=native 





sfscale: Scale data. 

sfscale < in.rsf > out.rsf axis=0 rscale=0. 

dscale=1. 

To scale by a constant factor, you can also use sfmath. 

int axis=0 Scale by maximum in the dimensions up 

to this axis. 

float dscale=1. Scale by this factor (works if rscale=0) 

float rscale=0. Scale by this factor. 

sfscale scales the input dataset by a factor. Here are some simple examples. 

First, let us create a test dataset. 

bash$ sfmath n1=5 n2=3 o1=1 o2=1 output="x1*x2" > test.rsf

bash$ < test.rsf sfdisfil


0: 1 2 3 4 5 

5: 2 4 6 8 10 

10: 3 6 9 12 15 

Scale every data point by 2:


bash$ < test.rsf sfscale dscale=2 | sfdisfil 

0: 2 4 6 8 10 

5: 4 8 12 16 20 

10: 6 12 18 24 30 

Divide every trace by its maximum value: 

bash$ < test.rsf sfscale axis=1 | sfdisfil 

0: 0.2 0.4 0.6 0.8 1 

5: 0.2 0.4 0.6 0.8 1 

10: 0.2 0.4 0.6 0.8 1 

Divide by the maximum value in the whole 2-D dataset: 

bash$ < test.rsf sfscale axis=2 | sfdisfil 

0: 0.06667 0.1333 0.2 0.2667 0.3333 

5: 0.1333 0.2667 0.4 0.5333 0.6667 

10: 0.2 0.4 0.6 0.8 1 

The rscale= parameter is synonymous with dscale= except when it is equal to zero.

With sfscale dscale=0, the dataset gets multiplied by zero. If using rscale=0, 

the other parameters are used to define scaling. Thus, sfscale rscale=0 axis=1 

is equivalent to sfscale axis=1, and sfscale rscale=0 is equivalent to sfscale 

dscale=1. 
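
The interplay of rscale=, axis= and dscale= described above can be sketched in Python. This is a hypothetical illustration for 2-D data only; using the maximum absolute value for normalization, and ignoring dscale= when axis= is nonzero, are assumptions of the sketch.

```python
def scale(data, axis=0, rscale=0.0, dscale=1.0):
    """Scale data[i2][i1] by a constant or by the maximum up to a given axis."""
    if rscale != 0.0:
        return [[v * rscale for v in tr] for tr in data]
    if axis == 0:
        return [[v * dscale for v in tr] for tr in data]
    if axis == 1:
        out = []
        for tr in data:
            peak = max(abs(v) for v in tr)  # per-trace maximum
            out.append([v / peak if peak else 0.0 for v in tr])
        return out
    peak = max(abs(v) for tr in data for v in tr)  # maximum of the 2-D dataset
    return [[v / peak if peak else 0.0 for v in tr] for tr in data]


test = [[x1 * x2 for x1 in range(1, 6)] for x2 in range(1, 4)]
print(scale(test, dscale=2)[1])
print(scale(test, axis=1)[2])
```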

sfspike: Generate simple data: spikes, boxes, planes, constants. 

sfspike < in.rsf > spike.rsf mag= nsp=1 k#=[0,...] l#=[k1,k2,...] p#=[0,...] 

n#= o#=[0,0,...] d#=[0.004,0.1,0.1,...] label#=[Time,Distance,Distance,...] 

unit#=[s,km,km,...] title= 

Spike positioning is given in samples and starts with 1. 

float d#=[0.004,0.1,0.1,...] sampling on #-th axis 

ints k#=[0,...] spike starting position [nsp] 

ints l#=[k1,k2,...] spike ending position [nsp] 

string 

label#=[Time,Distance,Distance,...] label on #-th axis 

floats mag= spike magnitudes [nsp] 

int n#= size of #-th axis 

int nsp=1 Number of spikes 

float o#=[0,0,...] origin on #-th axis 

floats p#=[0,...] spike inclination (in samples) [nsp] 

string title= title for plots 

string unit#=[s,km,km,...] unit on #-th axis


sfspike takes no input and generates an output with “spikes”. It is an easy way 

to create data. Here is an example: 

bash$ sfspike n1=5 n2=3 k1=4 k2=1 | sfdisfil 

0: 0 0 0 1 0 

5: 0 0 0 0 0 

10: 0 0 0 0 0 

The spike location is specified by parameters k1=4 and k2=1. Note that the locations 

are numbered starting from 1. If one of the parameters is omitted or given the value 

of zero, the spike in the corresponding direction becomes a plane: 

bash$ sfspike n1=5 n2=3 k1=4 | sfdisfil 

0: 0 0 0 1 0 

5: 0 0 0 1 0 

10: 0 0 0 1 0 

If no spike parameters are given, the whole dataset is filled with ones: 

bash$ sfspike n1=5 n2=3 | sfdisfil 

0: 1 1 1 1 1 

5: 1 1 1 1 1 

10: 1 1 1 1 1 

To create several spikes, use the nsp= parameter and give a comma-separated list 

of values to k#= arguments: 

bash$ sfspike n1=5 n2=3 nsp=3 k1=1,3,4 k2=1,2,3 | sfdisfil 

0: 1 0 0 0 0 

5: 0 0 1 0 0 

10: 0 0 0 1 0 

If the number of values in the list is smaller than nsp, the last value gets repeated, 

and the spikes add on top of each other, creating larger amplitudes: 

bash$ sfspike n1=5 n2=3 nsp=3 k1=1,3 k2=1,2 | sfdisfil 

0: 1 0 0 0 0 

5: 0 0 2 0 0 

10: 0 0 0 0 0 

The magnitude of the spikes can be controlled explicitly with the mag= parameter:


bash$ sfspike n1=5 n2=3 nsp=3 k1=1,3,4 k2=1,2,3 mag=1,4,2 | sfdisfil 

0: 1 0 0 0 0 

5: 0 0 4 0 0 

10: 0 0 0 2 0 

You can create boxes instead of spikes by using l#= parameters: 

bash$ sfspike n1=5 n2=3 k1=2 l1=4 k2=2 mag=8 | sfdisfil 

0: 0 0 0 0 0 

5: 0 8 8 8 0 

10: 0 0 0 0 0 

In this case, k1=2 specifies the box start, and l1=4 specifies the box end. 

Finally, multi-dimensional planes can be given an inclination by using p#= parameters: 

bash$ sfspike n1=5 n2=3 k1=2 p2=1 | sfdisfil 

0: 0 1 0 0 0 

5: 0 0 1 0 0 

10: 0 0 0 1 0 

When the inclination value is not integer, simple linear interpolation is used: 

bash$ sfspike n1=5 n2=3 k1=2 p2=0.7 | sfdisfil 

0: 0 1 0 0 0 

5: 0 0.3 0.7 0 0 

10: 0 0 0.6 0.4 0 
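
The interpolated values follow from splitting each unit spike between the two nearest samples. Here is a minimal Python sketch of that arithmetic; inclined_plane is a hypothetical function that reproduces the k1= and p2= case only and is not the sfspike source.

```python
import math


def inclined_plane(n1, n2, k1, p2):
    """Unit-amplitude plane starting at sample k1 (1-based), slope p2 samples/trace."""
    data = [[0.0] * n1 for _ in range(n2)]
    for i2 in range(n2):
        pos = (k1 - 1) + p2 * i2      # fractional zero-based sample position
        i1 = math.floor(pos)
        w = pos - i1                  # linear interpolation weight
        if 0 <= i1 < n1:
            data[i2][i1] += 1.0 - w
        if w > 0 and 0 <= i1 + 1 < n1:
            data[i2][i1 + 1] += w
    return data


plane = inclined_plane(5, 3, 2, 0.7)
for tr in plane:
    print(*[round(v, 4) for v in tr])
```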

sfspike supplies default dimensions and labels to all axes:

bash$ sfspike n1=5 n2=3 n3=4 > spike.rsf

bash$ sfin spike.rsf


spike.rsf: 

in="/var/tmp/spike.rsf@" 






As you can see, the first axis is assumed to be time, with sampling of 0.004 seconds. 

All other axes are assumed to be distance, with sampling of 0.1 kilometers. All these 

parameters can be changed on the command line.


bash$ sfspike n1=5 n2=3 n3=4 label3=Offset unit3=ft d3=20 > spike.rsf

bash$ sfin spike.rsf


spike.rsf: 





n3=4 d3=20 o3=0 label3="Offset" unit3="ft" 


sfspray: Extend a dataset by duplicating in the specified axis 

dimension. 

sfspray < in.rsf > out.rsf axis=2 n= d= o= label= unit= 

This operation is adjoint to sfstack. 

int axis=2 which axis to spray 

float d= Sampling of the newly created dimension 

string label= Label of the newly created dimension 

int n= Size of the newly created dimension 

float o= Origin of the newly created dimension 

string unit= Units of the newly created dimension 

sfspray extends the input hypercube by replicating the data in one of the dimensions. 

The output dataset acquires one additional dimension. Here is an example: 

Start with a 2-D dataset 

bash$ sfmath n1=5 n2=2 output=x1+x2 > test.rsf 

bash$ sfin test.rsf 

test.rsf: 

in="/var/tmp/test.rsf@" 


n1=5 d1=1 o1=0 

n2=2 d2=1 o2=0

bash$ < test.rsf sfdisfil



0: 0 1 2 3 4 

5: 1 2 3 4 5 

Extend the data in the second dimension 

bash$ < test.rsf sfspray axis=2 n=3 > test2.rsf 

bash$ sfin test2.rsf


test2.rsf: 

in="/var/tmp/test2.rsf@" 


n1=5 d1=1 o1=0 

n2=3 d2=1 o2=0 

n3=2 d3=1 o3=0 


bash$ < test2.rsf sfdisfil 

0: 0 1 2 3 4 

5: 0 1 2 3 4 

10: 0 1 2 3 4 

15: 1 2 3 4 5 

20: 1 2 3 4 5 

25: 1 2 3 4 5 

The output is three-dimensional, with traces from the original data duplicated along 

the second axis. 

Extend the data in the third dimension 

bash$ < test.rsf sfspray axis=3 n=2 > test3.rsf 

bash$ sfin test3.rsf 

test3.rsf: 

in="/var/tmp/test3.rsf@" 


n1=5 d1=1 o1=0 

n2=2 d2=1 o2=0 

n3=2 d3=? o3=? 


bash$ < test3.rsf sfdisfil 

0: 0 1 2 3 4 

5: 1 2 3 4 5 

10: 0 1 2 3 4 

15: 1 2 3 4 5 

The output is also three-dimensional, with the original data replicated along the third 

axis.
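
The difference between spraying along the second and the third axis is easiest to see in the order of the output traces. The Python sketch below uses hypothetical helper names and handles only a 2-D input.

```python
def spray(data, axis, n):
    """Replicate 2-D data[i2][i1]; the result is indexed as cube[i3][i2][i1]."""
    if axis == 2:
        # Each input trace becomes a plane of n identical traces.
        return [[list(tr) for _ in range(n)] for tr in data]
    if axis == 3:
        # The whole input plane is repeated n times.
        return [[list(tr) for tr in data] for _ in range(n)]
    raise ValueError("this sketch handles axis=2 or axis=3 only")


def flat_traces(cube):
    """List the traces in storage order (first axis fastest)."""
    return [tr for plane in cube for tr in plane]


test = [[x1 + x2 for x1 in range(5)] for x2 in range(2)]
print([tr[0] for tr in flat_traces(spray(test, 2, 3))])
print([tr[0] for tr in flat_traces(spray(test, 3, 2))])
```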


sfstack: Stack a dataset over one of the dimensions. 

sfstack < in.rsf > out.rsf scale= axis=2 rms=n norm=y min=n max=n prod=n 

This operation is adjoint to sfspray. 

int axis=2 which axis to stack 

bool max=n [y/n] If y, find maximum instead of stack. Ignores 

rms and norm. 

bool min=n [y/n] If y, find minimum instead of stack. Ignores 

rms and norm. 

bool norm=y [y/n] If y, normalize by fold. 

bool prod=n [y/n] If y, find product instead of stack. Ignores 

rms and norm. 

bool rms=n [y/n] If y, compute the root-mean-square instead 

of stack. 

floats scale= optionally scale before stacking [n2] 

While sfspray adds a dimension to a hypercube, sfstack effectively removes one 

of the dimensions by stacking over it. Here are some examples: 

bash$ sfmath n1=5 n2=3 output=x1+x2 > test.rsf

bash$ < test.rsf sfdisfil


0: 0 1 2 3 4 

5: 1 2 3 4 5 

10: 2 3 4 5 6 

bash$ < test.rsf sfstack axis=2 | sfdisfil 

0: 1.5 2 3 4 5 

bash$ < test.rsf sfstack axis=1 | sfdisfil 

0: 2.5 3 4 

Why is the first value not 1 (in the first case) or 2 (in the second case)? By default, 

sfstack normalizes the stack by the fold (the number of non-zero entries). To avoid 

normalization, use norm=n, as follows: 

bash$ < test.rsf sfstack norm=n | sfdisfil 

0: 3 6 9 12 15 

sfstack can also compute root-mean-square values as well as minimum and maximum 

values. 

bash$ < test.rsf sfstack rms=y | sfdisfil 

0: 1.581 2.16 3.109 4.082 5.066 

bash$ < test.rsf sfstack min=y | sfdisfil 

0: 0 1 2 3 4 

bash$ < test.rsf sfstack axis=1 max=y | sfdisfil 

0: 4 5 6
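
Normalization by fold explains the values 1.5 and 2.5 above. The Python sketch below reproduces the stack and root-mean-square examples; the function stack is hypothetical, and applying the fold inside the square root for rms=y is an assumption chosen to match the printed output.

```python
import math


def stack(data, axis=2, norm=True, rms=False):
    """Stack data[i2][i1] over axis 2 (across traces) or axis 1 (along traces)."""
    groups = list(zip(*data)) if axis == 2 else data
    out = []
    for values in groups:
        total = sum(v * v for v in values) if rms else sum(values)
        # Fold is the number of non-zero entries contributing to the sum.
        fold = sum(1 for v in values if v != 0) if norm else 1
        value = total / fold if fold else 0.0
        out.append(math.sqrt(value) if rms else value)
    return out


test = [[x1 + x2 for x1 in range(5)] for x2 in range(3)]
print(stack(test, axis=2))
print(stack(test, axis=1))
print(stack(test, norm=False))
print([round(v, 3) for v in stack(test, rms=True)])
```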


sftransp: Transpose two axes in a dataset. 

sftransp < in.rsf > out.rsf memsize=sf_memsize() plane=

If you get a "Cannot allocate memory" error, give the program a 

memsize=1 command-line parameter to force out-of-core operation. 


int plane= Two-digit number with axes to transpose. 

The default is 12 

The sftransp program transposes the input hypercube exchanging the two axes 

specified by the plane= parameter. 

bash$ sfspike n1=10 n2=20 n3=30 > orig123.rsf 

bash$ sfin orig123.rsf 

orig123.rsf: 

in="/var/tmp/orig123.rsf@" 



n2=20 d2=0.1 o2=0 label2="Distance2" unit2="km" 



bash$ < orig123.rsf sftransp plane=23 > out132.rsf

bash$ sfin out132.rsf 

out132.rsf: 

in="/var/tmp/out132.rsf@" 






bash$ < orig123.rsf sftransp plane=13 > out321.rsf

bash$ sfin out321.rsf 

out321.rsf: 

in="/var/tmp/out321.rsf@"






sftransp tries to fit the dataset in memory to transpose it there but, if not enough 

memory is available, it performs a slower transpose out of core using disk operations. 

You can control the amount of available memory using the memsize= parameter or


the RSFMEMSIZE environment variable.
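
How the two-digit plane= value maps onto the axes can be sketched in Python. The helpers below are hypothetical: transposed_shape only tracks the axis sizes, and transpose12 performs the default in-memory 2-D transpose on a nested list.

```python
def transposed_shape(shape, plane=12):
    """Swap the two axes named by the digits of plane (axes are numbered from 1)."""
    a, b = divmod(plane, 10)
    new = list(shape)
    new[a - 1], new[b - 1] = new[b - 1], new[a - 1]
    return new


def transpose12(data):
    """Transpose 2-D data[i2][i1] so that the result is indexed [i1][i2]."""
    return [list(col) for col in zip(*data)]


shape = [10, 20, 30]                # n1, n2, n3 of orig123.rsf
print(transposed_shape(shape, 23))  # sizes expected for out132.rsf
print(transposed_shape(shape, 13))  # sizes expected for out321.rsf
print(transpose12([[1, 2, 3], [4, 5, 6]]))
```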

sfwindow: Window a portion of a dataset. 

sfwindow < in.rsf > out.rsf verb=n squeeze=y j#=(1,...) d#=(d1,d2,...) 

f#=(0,...) min#=(o1,o2,...) n#=(0,...) max#=(o1+(n1-1)*d1,o2+(n2-1)*d2,...)

float d#=(d1,d2,...) sampling in #-th dimension 

largeint f#=(0,...) window start in #-th dimension 

int j#=(1,...) jump in #-th dimension 

float max#=(o1+(n1-1)*d1,o2+(n2-1)*d2,...) maximum in #-th dimension

float min#=(o1,o2,...) minimum in #-th dimension

largeint n#=(0,...) window size in #-th dimension 

bool squeeze=y [y/n] if y, squeeze dimensions equal to 1 to the 

end 


sfwindow is used to window a portion of the dataset. Here is a quick example: 

Start by creating some data. 

bash$ sfmath n1=5 n2=3 o1=1 o2=1 output="x1*x2" > test.rsf

bash$ < test.rsf sfdisfil


0: 1 2 3 4 5 

5: 2 4 6 8 10 

10: 3 6 9 12 15 

Now window the first two rows: 

bash$ < test.rsf sfwindow n2=2 | sfdisfil 

0: 1 2 3 4 5 

5: 2 4 6 8 10 

Window the first three columns: 

bash$ < test.rsf sfwindow n1=3 | sfdisfil 

0: 1 2 3 2 4 

5: 6 3 6 9 

Window the middle row: 

bash$ < test.rsf sfwindow f2=1 n2=1 | sfdisfil 

0: 2 4 6 8 10 

You can interpret the f# and n# parameters as meaning "skip that many

rows/columns" and "select that many rows/columns", respectively. Window the middle

point in the dataset:


bash$ < test.rsf sfwindow f1=2 n1=1 f2=1 n2=1 | sfdisfil 

0: 6 

Window every other column: 

bash$ < test.rsf sfwindow j1=2 | sfdisfil 

0: 1 3 5 2 6 

5: 10 3 9 15 

Window every third column: 

bash$ < test.rsf sfwindow j1=3 | sfdisfil 

0: 1 4 2 8 3 

5: 12 
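
The f#, n# and j# parameters behave much like a slice with a start, a count and a step. A minimal Python sketch for 2-D data follows; the function names are hypothetical and the squeeze= option is not modeled.

```python
def window_axis(seq, f=0, n=None, j=1):
    """Skip f elements, then take every j-th element, at most n of them."""
    picked = seq[f::j]
    return picked if n is None else picked[:n]


def window2d(data, f1=0, n1=None, j1=1, f2=0, n2=None, j2=1):
    """Window data[i2][i1] along both axes."""
    rows = window_axis(data, f2, n2, j2)
    return [window_axis(list(tr), f1, n1, j1) for tr in rows]


test = [[x1 * x2 for x1 in range(1, 6)] for x2 in range(1, 4)]
print(window2d(test, f2=1, n2=1))
print(window2d(test, f1=2, n1=1, f2=1, n2=1))
print(window2d(test, j1=3))
```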

Alternatively, sfwindow can use the minimum and maximum parameters to select 

a window. In the following example, we are creating a dataset with sfspike and 

then windowing a portion of it between 1 and 2 seconds in time and sampled at 8 

milliseconds.

bash$ sfspike n1=1000 n2=10 > spike.rsf

bash$ sfin spike.rsf


spike.rsf: 






bash$ < spike.rsf sfwindow min1=1 max1=2 d1=0.008 > window.rsf 

bash$ sfin window.rsf

window.rsf:

in="/var/tmp/window.rsf@" 
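
Selecting a window by coordinates amounts to converting min#, max# and d# into a start sample, a sample count and a jump. The Python sketch below shows one way to do the conversion; window_from_range is a hypothetical helper, and rounding to the nearest sample is an assumption that may differ from the rounding used by sfwindow.

```python
def window_from_range(n, o, d, vmin, vmax, d_new):
    """Convert a coordinate range and new sampling into (f, n, j, o, d)."""
    j = max(1, int(round(d_new / d)))            # jump in samples
    f = int(round((vmin - o) / d))               # first selected sample
    last = min(n - 1, int(round((vmax - o) / d)))
    n_new = (last - f) // j + 1                  # number of selected samples
    return f, n_new, j, o + f * d, j * d


# First axis of spike.rsf: n1=1000, o1=0, d1=0.004 (sfspike defaults)
f1, n1, j1, o1, d1 = window_from_range(1000, 0.0, 0.004, 1.0, 2.0, 0.008)
print(f1, n1, j1, o1, d1)
```

Under these assumptions the window starts at sample 250, takes every second sample, and covers the interval from 1 to 2 seconds.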





By default, sfwindow “squeezes” the hypercube dimensions that are equal to one 

toward the end of the dataset. Here is an example of taking a time slice: 

bash$ < spike.rsf sfwindow n1=1 min1=1 > slice.rsf 

bash$ sfin slice.rsf 

slice.rsf: 

in="/var/tmp/slice.rsf@"






You can change this behavior by specifying squeeze=n. 

bash$ < spike.rsf sfwindow n1=1 min1=1 squeeze=n > slice.rsf 

bash$ sfin slice.rsf

slice.rsf:

in="/var/tmp/slice.rsf@" 





REFERENCES 

Claerbout, J., 1998, Multidimensional recursive filters via a helix: Geophysics, 63, 

1532–1541.



Guide to RSF format 


ABSTRACT 

This guide explains the RSF file format. 

PRINCIPLES 

The main design principle behind the RSF file format is KISS (“Keep It Simple, 

Stupid!”). The RSF format is borrowed from the SEPlib data format originally 

designed at the Stanford Exploration Project (Claerbout, 1991). The format is made 

as simple as possible for maximum convenience, transparency and flexibility. 

According to the Unix tradition, common file formats should be in a readable 

textual form so that they can be easily examined and processed with universal tools. 

Raymond (2004) writes: 

To design a perfect anti-Unix, make all file formats binary and opaque, 

and require heavyweight tools to read and edit them. 

If you feel an urge to design a complex binary file format, or a complex 

binary application protocol, it is generally wise to lie down until the feeling 

passes. 

Storing large-scale datasets in a text format may not be economical. RSF chooses 

the next best thing: it allows data values to be stored in a binary format but puts all 

data attributes in text files that can be read by humans and processed with universal 

text-processing utilities. 

Example 

Let us first create some synthetic RSF data. 

bash$ sfmath n1=1000 output='sin(0.5*x1)' > sin.rsf

Open and read the file sin.rsf. 




bash$ cat sin.rsf 

sfmath rsf/rsf/rsftour: fomels@egl Sun Jul 31 07:18:48 2005 

o1=0 

data_format="native_float" 

esize=4 

in="/tmp/sin.rsf@" 

x1=0 

d1=1 

n1=1000 

The file contains nine lines with simple readable text. The first line shows the name 

of the program, the working directory, the user and computer that created the file and 

the time it was created (that information is recorded for accounting purposes). Other 

lines contain parameter-value pairs separated by the “=” sign. The “in” parameter 

points to the location of the binary data. Before we discuss the meaning of parameters 

in more detail, let us plot the data. 

bash$ < sin.rsf sfwiggle title='One Trace' | sfpen

On your screen, you should see a plot similar to Figure 1. 

Suppose you want to reformat the data so that instead of one trace of a thousand 

samples, it contains twenty traces with fifty samples each. Try running 

bash$ < sin.rsf sed 's/n1=1000/n1=50 n2=20/' > sin10.rsf

bash$ < sin10.rsf sfwiggle title=Traces | sfpen 

or (using pipes) 

bash$ < sin.rsf sed 's/n1=1000/n1=50 n2=20/' | sfwiggle title=Traces | sfpen

On your screen, you should see a plot similar to Figure 2. 

What happened? We used sed, a standard Unix line editing utility, to change the parameters describing the data dimensions. Because this operation is so simple, there is no need to create specialized data formatting tools or to make the sfwiggle program accept additional formatting parameters. Other general-purpose Unix tools that can be applied to RSF files include cat, echo, grep, etc.

An alternative way to obtain the previous result is to run 

bash$ ( cat sin.rsf; echo n1=50 n2=20 ) > sin10.rsf 

bash$ < sin10.rsf sfwiggle title=Traces | sfpen


Figure 1: An example sinusoid plot. rsf/format sin1


Figure 2: An example sinusoid plot, with data reformatted to twenty traces. 

rsf/format sin2


In this case, the cat utility simply copies the contents of the previous file, and the echo utility appends a new line, “n1=50 n2=20”. The new value of the n1 parameter overrides the old value of n1=1000, and we achieve the same result as before.

Of course, one could also edit the file by hand with one of the general purpose 

text editors. For recording the history of data processing, it is usually preferable to 

be able to process files with non-interactive tools. 
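The override rule used by the cat and echo example can be illustrated with a short sketch. The Python below is not part of Madagascar: parse_header is a hypothetical helper, written under the assumption that parameters are read in order and that a later value of the same parameter replaces an earlier one. Quoting is handled only in the simplest way.

```python
def parse_header(text):
    """Collect key=value pairs from RSF-style header text; later values win."""
    params = {}
    for line in text.splitlines():
        for token in line.split():
            if "=" not in token:
                continue  # words from the first (provenance) line carry no "="
            key, value = token.split("=", 1)
            params[key] = value.strip('"')
    return params


# Header text from the example above, with the line appended by echo.
header = """sfmath rsf/rsf/rsftour: fomels@egl Sun Jul 31 07:18:48 2005
o1=0
data_format="native_float"
esize=4
in="/tmp/sin.rsf@"
x1=0
d1=1
n1=1000
n1=50 n2=20
"""

params = parse_header(header)
print(params["n1"], params["n2"])
```

The appended n1=50 replaces n1=1000 simply because it is read last, which is why no header line has to be deleted or edited in place.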

HEADER AND DATA FILES 

A simple way to check the layout of an RSF file is with the sfin program. 

bash$ sfin sin10.rsf 

sin10.rsf: 



n1=50 d1=1 o1=0 

n2=20 d2=? o2=? 


The program reports the following information: the location of the data file (/tmp/sin.rsf@),

the element size (4 bytes), the element type (floating point), the element form (native), 

the hypercube dimensions (50 × 20), axis scaling (1 and unspecified), and axis 

origin (0 and unspecified). It also checks the total number of elements and bytes in 

the data file. 

Let us examine this information in detail. First, we can verify that the data file 

exists and contains the specified number of bytes: 

bash$ ls -l /tmp/sin.rsf@ 

-rw-r--r-- 1 sergey users 4000 2004-10-04 00:35 /tmp/sin.rsf@ 

4000 bytes in this file are required to store 50 × 20 floating-point 4-byte numbers in 

a binary form. Thus, the data file contains nothing but the raw data in a contiguous 

binary form. 
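The size check above is plain arithmetic: the number of elements in the hypercube times the element size. A minimal sketch follows; expected_bytes is a hypothetical helper name, not an RSF function.

```python
import math


def expected_bytes(dims, esize):
    """Bytes needed by a binary data file holding a hypercube of the given dimensions."""
    return math.prod(dims) * esize


# 50 x 20 samples of 4-byte floats, as in sin10.rsf
size = expected_bytes([50, 20], 4)
print(size)
```

Comparing this number with the size reported by ls -l is a quick way to detect a header whose dimensions no longer match the data file.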

Datapath 

How did the RSF program (sfmath) decide where to put the data file? In the order 

of priority, the rules for selecting the data file name and the data file directory are as 

follows: 

1. Check out= parameter on the command line. The parameter specifies the output 

data file location explicitly.


2. Specify the path and the file name separately. 

• The rules for the path selection are: 

(a) Check datapath= parameter on the command line. The parameter 

specifies a string to prepend to the file name. The string may contain 

the file directory. 

(b) Check the DATAPATH environment variable. It has the same meaning as

the parameter specified with datapath=.

(c) Check for .datapath file in the current directory. The file may contain 

a line 

datapath=/path/to_file/ 

or 

machine_name datapath=/path/to_file/ 

if you intend to use different paths on different platforms.

(d) Check for .datapath file in the user home directory. 

(e) Put the data file in the current directory (similar to datapath=./). 

• The rules for the filename selection are: 

(a) If the output RSF file is in the current directory, the name of the data

file is made by appending @ to the name of the RSF file.

(b) If the output file is not in the current directory or if it is created 

temporarily by a program, the name is made by appending random 

characters to the name of the program and selected to be unique. 
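The priority order above can be summarized in a small sketch. This is an illustration of the rules as listed, not the library's implementation: choose_data_file and read_datapath_file are hypothetical names, the .datapath contents are passed in as strings rather than read from disk, only filename rule (a) is covered, and preferring a machine-specific line over a generic one is an assumption.

```python
import os


def read_datapath_file(text, machine):
    """Return the datapath found in .datapath contents, or None."""
    generic = None
    for line in text.splitlines():
        words = line.split()
        if len(words) == 1 and words[0].startswith("datapath="):
            generic = words[0][len("datapath="):]
        elif (len(words) == 2 and words[0] == machine
              and words[1].startswith("datapath=")):
            # assumption: a machine-specific entry takes precedence
            return words[1][len("datapath="):]
    return generic


def choose_data_file(header, args, env, cwd_file=None, home_file=None, machine=""):
    """Pick the data file name for a header in the current directory."""
    if "out" in args:                      # rule 1: explicit out=
        return args["out"]
    candidates = [args.get("datapath"),    # rule 2(a): datapath= on the command line
                  env.get("DATAPATH")]     # rule 2(b): DATAPATH environment variable
    for text in (cwd_file, home_file):     # rules 2(c) and 2(d): .datapath files
        candidates.append(read_datapath_file(text, machine) if text else None)
    path = next((c for c in candidates if c), "./")  # rule 2(e): current directory
    return path + os.path.basename(header) + "@"


print(choose_data_file("spike.rsf", {"datapath": "/tmp/"}, {"DATAPATH": "/scratch/"}))
```

Passing the environment and file contents as arguments keeps the sketch easy to test; a real program would consult os.environ and the file system instead.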

Examples: 

• 

bash$ sfspike n1=10 out=test1 > spike.rsf 

bash$ grep in spike.rsf 

in="test1" 

• 

bash$ sfspike n1=10 datapath=/tmp/ > spike.rsf 



•


bash$ DATAPATH=/tmp/ sfspike n1=10 > spike.rsf 



• 

bash$ sfspike n1=10 datapath=/tmp/ > /tmp/spike.rsf 

bash$ grep in /tmp/spike.rsf 

in="/tmp/sfspikejcARVf" 

Packing header and data together 

While the header and data files are separated by default, it is also possible to pack 

them together into one file. To do that, specify the program’s “out” parameter as 

out=stdout. Example: 

bash$ sfspike n1=10 out=stdout > spike.rsf 


bash$ grep in spike.rsf

Binary file spike.rsf matches

bash$ sfin spike.rsf

spike.rsf:

in="stdin" 




bash$ ls -l spike.rsf 

-rw-r--r-- 1 sergey users 196 2004-11-10 21:39 spike.rsf 

If you examine the contents of spike.rsf, you will find that it starts with the text 

header information, followed by special symbols, followed by binary data. 

Packing headers and data together may not be a good idea for data processing 

but it works well for storing data: it is easier to move the packed file around than to 

move two different files (header and binary) together while remembering to preserve 

their connection. Packing header and data together is also the current mechanism 

used to push RSF files through Unix pipes. 

Type 

The data stored with RSF can have different types: character, unsigned character, 

integer, floating point, or complex. By default, single precision is used for numbers 

(int and float data types in the C programming language). The number of bytes 

required to represent these numbers may depend on the platform.


Form 

The data stored with RSF can also be in different forms: ASCII, native binary, and XDR binary. Native binary is often used by default. It is the binary format employed by the machine that is running the application. On a PC running Linux, the native binary format will typically correspond to the so-called little-endian byte ordering. On some other platforms, it might be big-endian ordering. XDR is a binary format designed by Sun for exchanging files over a network. It typically corresponds to big-endian byte ordering. It is more efficient to process RSF files in the native binary format but, if you intend to access data from different platforms, it might be a good idea to store the corresponding file in an XDR format. RSF also allows for an ASCII (plain text) form of data files.
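To make the byte-ordering distinction concrete, here is a small sketch that uses only Python's standard struct module (nothing from RSF). It packs the same 4-byte float in little-endian and big-endian order and reports which one the current machine uses natively.

```python
import struct
import sys

value = 1.0
little = struct.pack("<f", value)  # little-endian 4-byte float
big = struct.pack(">f", value)     # big-endian 4-byte float
native = struct.pack("=f", value)  # byte order of the machine running this code

print(little.hex(), big.hex(), sys.byteorder)
```

The two byte strings are mirror images of each other, which is why a native file written on one platform cannot be read as native on a platform with the opposite ordering.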

Conversion between different types and forms is accomplished with the sfdd program. Here are some examples. First, let us create synthetic data.

bash$ sfmath n1=10 output='10*sin(0.5*x1)' > sin.rsf

bash$ sfin sin.rsf 

sin.rsf: 



n1=10 d1=1 o1=0 


bash$ < sin.rsf sfdisfil 

0: 0 4.794 8.415 9.975 9.093 

5: 5.985 1.411 -3.508 -7.568 -9.775 

Converting the data to the integer type: 

bash$ < sin.rsf sfdd type=int > isin.rsf 

bash$ sfin isin.rsf 

isin.rsf: 

in="/tmp/isin.rsf@" 

esize=4 type=int form=native 

n1=10 d1=1 o1=0 


bash$ < isin.rsf sfdisfil 

0: 0 4 8 9 9 5 1 -3 -7 -9 
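The integer values in this listing are consistent with truncation toward zero rather than rounding to the nearest integer (9.975 becomes 9, and -3.508 becomes -3). The following sketch reproduces the listing in plain Python under that assumption; it does not call sfdd, and whether sfdd behaves this way in every case is not stated in this guide.

```python
import math

# The same synthetic data as sfmath n1=10 output='10*sin(0.5*x1)'
floats = [10 * math.sin(0.5 * x1) for x1 in range(10)]

# int() in Python truncates toward zero, like a C cast from float to int
ints = [int(v) for v in floats]
print(ints)
```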

Converting the data to the ASCII form: 

bash$ < sin.rsf sfdd form=ascii > asin.rsf 

bash$ < asin.rsf sfdisfil 

0: 0 4.794 8.415 9.975 9.093


5: 5.985 1.411 -3.508 -7.568 -9.775 

bash$ sfin asin.rsf 

asin.rsf: 

in="/tmp/asin.rsf@" 

esize=0 type=float form=ascii 

n1=10 d1=1 o1=0 

10 elements 

bash$ cat /tmp/asin.rsf@ 

0 4.79426 8.41471 9.97495 9.09297 5.98472 1.4112 -3.50783 

-7.56803 -9.7753 

Hypercube 

While RSF stores binary data in a contiguous 1-D array, the conceptual data model 

is a multidimensional hypercube. By convention, the dimensions of the cube are 

defined with parameters n1, n2, n3, etc. The fastest axis is n1. Additionally, the 

grid sampling can be given by parameters d1, d2, d3, etc. The axes origins are given 

by parameters o1, o2, o3, etc. Optionally, you can also supply the axis label strings 

label1, label2, label3, etc., and axis units strings unit1, unit2, unit3, etc. 
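The relation between the hypercube and the contiguous 1-D array can be sketched as follows. The helper names flat_index and coordinate are hypothetical and not part of RSF; the sketch assumes zero-based indices and that n1 is the fastest axis, as described above.

```python
def flat_index(index, dims):
    """Position in the 1-D array of the element at (i1, i2, ...), i1 fastest."""
    offset = 0
    stride = 1
    for i, n in zip(index, dims):
        if not 0 <= i < n:
            raise IndexError("index out of range")
        offset += i * stride
        stride *= n
    return offset


def coordinate(i, o, d):
    """Physical coordinate of sample i on an axis with origin o and sampling d."""
    return o + i * d


# Sample 3 of trace 2 in a 50 x 20 dataset
print(flat_index((3, 2), (50, 20)))
# Time of sample 25 on an axis with o1=1 and d1=0.008
print(coordinate(25, 1.0, 0.008))
```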

COMPATIBILITY WITH OTHER FILE FORMATS 

It is possible to exchange RSF-formatted data with other popular data formats. 

Compatibility with SEPlib 

RSF is mostly compatible with its predecessor, the SEPlib file format. However, there 

are several important differences: 

1. SEPlib programs typically use the element size (esize= parameter) to distinguish

between different data types: esize=4 corresponds to floating point data, while

esize=8 corresponds to complex data. The typical type handling mechanism

in RSF is different: RSF looks at data_format= to determine the data type.

2. The default data form in SEPlib programs is typically XDR and not native as 

it is in RSF. 

3. It is possible to pipe the output of RSF programs to SEPlib: 

bash$ sfspike n1=1 | Attr want=min 

minimum value = 1 at 1


However, piping the output of SEPlib programs to RSF (or, for that matter, 

any other non-SEPlib programs) will result in an unterminated process. Do not 

try 

bash$ Spike n1=1 | sfattr want=min

That happens because SEPlib uses sockets for piping and expects a socket 

connection from the receiving program. RSF passes data through regular Unix 

pipes. 

4. SEP3D is an extension of SEPlib for operating with irregularly sampled data 

(Biondi et al., 1996). There is no equivalent of it in RSF for the reasons explained 

in the beginning of this guide. Operations with irregular datasets are 

supported through the use of auxiliary input files that represent the geometry 

information. 

Reading and writing SEG-Y and SU files 

The SEG-Y format is based on the proposal of Barry et al. (1975). It was revised in 2002 (see footnote 2). The SU format is a modification of SEG-Y used in Seismic Unix (Stockwell, 1997).

To convert files from SEG-Y or SU format to RSF, use the sfsegyread program. 

Let us first manufacture an example file using SU utilities (Stockwell, 1999): 

bash$ suplane > plane.su 

bash$ segyhdrs < plane.su | segywrite tape=plane.segy 

To convert it to RSF, use either 

bash$ sfsuread < plane.su tfile=tfile.rsf endian=0 > plane.rsf 

or 

bash$ sfsegyread < plane.segy tfile=tfile.rsf \ 

hfile=hfile bfile=bfile endian=0 > plane.rsf 

The endian flag is needed if the SU file originated from a little-endian machine such as a Linux PC.

Several files are generated. The standard output contains an RSF file with the 

data (32 traces with 64 samples each): 

2 See http://seg.org/publications/tech-stand/seg_y_rev1.pdf.


bash$ sfin plane.rsf 

plane.rsf: 

in="/tmp/plane.rsf@" 


n1=64 d1=0.004 o1=0 

n2=32 d2=? o2=? 


The contents of this file are displayed in Figure 3. The tfile is an RSF integer-type file with the trace headers (71 header values for each of the 32 traces):

bash$ sfin tfile.rsf 

tfile.rsf: 

in="/tmp/tfile.rsf@" 

esize=4 type=int form=native 

n1=71 d1=? o1=? 

n2=32 d2=? o2=? 


The contents of trace headers can be quickly examined with the sfheaderattr program. 

The hfile is the ASCII header file for the whole record. 

bash$ head -c 242 hfile 

C This tape was made at the 

C 

C Center for Wave Phenomena 

The bfile is the binary header file. 

To convert files back from RSF to SEG-Y or SU, use the sfsegywrite program 

and reverse the input and output: 

bash$ sfsuwrite > plane.su tfile=tfile.rsf endian=0 < plane.rsf 

or 

bash$ sfsegywrite > plane.segy tfile=tfile.rsf \ 

hfile=hfile bfile=bfile endian=0 < plane.rsf 

If hfile= and bfile= are not supplied to sfsegywrite, the corresponding headers 

will be either picked from the default locations (files named header and binary) or 

generated on the fly. The trace header file can be generated with sfsegyheader. 

Here is an example:


Figure 3: The output of suplane, converted to RSF and displayed with sfwiggle. 

rsf/format plane


bash$ rm header binary 

bash$ sfheadermath < plane.rsf output=N+1 | sfdd type=int > tracl.rsf 

bash$ sfsegyheader < plane.rsf tracl=tracl.rsf > tfile.rsf 

bash$ sfsegywrite < plane.rsf tfile=tfile.rsf > plane.segy 

Reading and writing ASCII files 

Reading and writing ASCII files can be accomplished with the sfdd program. For 

example, let us take an ASCII file with numbers 

bash$ cat file.asc 

1.0 1.5 3.0 

4.8 9.1 7.3 

Converting it to RSF is as simple as 

bash$ echo in=file.asc n1=3 n2=2 data_format=ascii_float > file.rsf 

bash$ sfin file.rsf 

file.rsf: 

in="file.asc" 

esize=0 type=float form=ascii 

n1=3 d1=? o1=? 

n2=2 d2=? o2=? 

6 elements 

For more efficient input/output operations, it might be advantageous to convert the data form to native binary, as follows:

bash$ echo in=file.asc n1=3 n2=2 data_format=ascii_float | \ 

sfdd form=native > file.rsf 

bash$ sfin file.rsf 

file.rsf: 

in="/tmp/file.rsf@" 


n1=3 d1=? o1=? 

n2=2 d2=? o2=? 


Converting from RSF to ASCII is equally simple:

bash$ sfdd form=ascii out=file.asc < file.rsf > /dev/null 


bash$ cat file.asc

1 1.5 3 4.8 9.1 7.3


You can use the line= and format= parameters in sfdd to control the ASCII formatting: 

bash$ sfdd form=ascii out=file.asc \ 

line=3 format="%3.1f " < file.rsf > /dev/null 


bash$ cat file.asc

1.0 1.5 3.0

4.8 9.1 7.3 

An alternative is to use sfdisfil. 

bash$ sfdisfil > file.asc col=3 format="%3.1f " number=n < file.rsf 


bash$ cat file.asc

1.0 1.5 3.0

4.8 9.1 7.3 
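The effect of the formatting parameters can be imitated in a few lines of Python, whose % operator follows the same printf conventions as the format string above. This is only a sketch of the idea: format_ascii is a hypothetical helper, and details such as trailing spaces in the real sfdd or sfdisfil output are not guaranteed to match.

```python
def format_ascii(values, fmt="%3.1f ", per_line=3):
    """Format numbers with a printf-style pattern, per_line values on each line."""
    lines = []
    for start in range(0, len(values), per_line):
        chunk = values[start:start + per_line]
        lines.append("".join(fmt % v for v in chunk))
    return "\n".join(lines) + "\n"


text = format_ascii([1.0, 1.5, 3.0, 4.8, 9.1, 7.3])
print(text, end="")
```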

OTHER DOCUMENTATION 

This note should give you a general understanding of the RSF file format. Other 

relevant documentation is 

• Introduction to RSF 

• Installation instructions 

• Self-documentation reference for RSF programs 

• A guide to RSF programs 

• A guide to RSF programming interface 

• A guide to programming with RSF 

• A tour of RSF software 

• A guide to SCons interface for reproducible computations 

REFERENCES 

Barry, K. M., D. A. Cavers, and C. W. Kneale, 1975, Report on recommended standards 

for digital tape formats: Geophysics, 40, 344–352. 

Biondi, B., R. Clapp, and S. Crawley, 1996, Seplib90: Seplib for 3-D prestack data, 

in SEP-92: Stanford Exploration Project, 343–364. 

Claerbout, J. F., 1991, Introduction to Seplib and SEP utility software, in SEP-70: 

Stanford Exploration Project, 413–436.


Raymond, E. S., 2004, The art of UNIX programming: Addison-Wesley. 

Stockwell, J. W., 1997, Free software in education: A case study of CWP/SU: Seismic 

Unix: The Leading Edge, 16, 1045–1049. 

——–, 1999, The CWP/SU: Seismic Un*x package: Computers and Geosciences, 25, 

415–419.



Revisiting SEP tour with Madagascar and SCons 


ABSTRACT 

Many appreciative users were introduced to SEPlib (Claerbout, 1991) by an excellent 

article of Dellinger and Tálas (1992). In this paper, I show how to create 

a similar experience using Madagascar and SCons. 

GETTING STARTED 

Similarly to SU and SEPlib, RSF programs can be piped and executed from the 

command line, for example: 

bash$ sfspike n1=1000 k1=300 title="\s200 Welcome to \c2 RSF" | \ 

sfbandpass fhi=2 phase=1 | sfwiggle | sfpen 

If you are already familiar with SEPlib, you can find most of the familiar programs 

with the names prepended by “sf”. 

Typing a command without arguments should produce concise self-documentation.

bash$ sfbandpass 

The recommended way of using RSF, however, is not with the command line but 

with SCons and “SConstruct” files. 

Setting up 

Open a file named “SConstruct” in your favorite editor and start it with the line

from rsf.proj import *

This line tells Python to load the RSF project module.




Obtaining the test data 

Add a Fetch command as follows: 

Fetch('Txx.HH','septour')

Now, by running 

bash$ scons Txx.HH 

you can instruct SCons to connect to an anonymous data server and extract (fetch) 

the data file “Txx.HH” from the “septour” directory. 

Displaying the data 

Add the following line to the SConstruct file: 

Result('wiggle0','Txx.HH','wiggle')

Note that it does not matter if this line appears before or after the “Fetch” line. 

You are simply instructing SCons how to create a result plot from the input. 

Run 

bash$ scons wiggle0.view 

If everything is set up correctly in your environment, you should see something like

the following output in your terminal: 

bash$ scons wiggle0.view 

scons: Reading SConscript files ... 

scons: done reading SConscript files. 

scons: Building targets ... 

retrieve(["Txx.HH"], []) 

< Txx.HH /path/to/RSF/bin/sfwiggle > Fig/wiggle0.vpl 

/path/to/RSF/bin/sfpen Fig/wiggle0.vpl 

and a figure similar to Figure 1 appearing on your screen. 

Figure 1: To see this figure on your screen, run scons wiggle0.view

rsf/rsftour wiggle0

PROCESSING EXERCISES

Windowing and plotting

Our next task is to window and plot a significant portion of the data. Add the following line to the SConstruct file:

Flow('windowed','Txx.HH','window n2=10 min1=0.4 max1=0.8')

The window command selects the first ten traces and the time window between 

0.4 and 0.8 seconds. 

We will plot the windowed data with three different plotting programs. 

plotpar = '''
transp=y poly=y yreverse=y pclip=100 nc=20 allpos=n
unit2=km unit1=s label1=Time label2=Offset
'''

for plot in ('wiggle','contour','grey'):
    Result(plot,'windowed',plot + plotpar)

For convenience, plotting parameters are put in a string called plotpar. A Python 

string can be enclosed in single, double, or triple quotes. Triple quotes allow the 

string to span multiple lines. In this case, we use triple quotes for convenience. Next, 

we loop (using Python’s for construct) through three different programs (wiggle, 

contour, and grey). For each program, the command portion of Result is formed 

by concatenating two strings with Python’s addition operator. 
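The string handling in this loop is ordinary Python and can be tried outside SCons. The sketch below builds the three command strings without calling Result; the commands dictionary is only an illustration and is not part of the SConstruct file.

```python
plotpar = '''
transp=y poly=y yreverse=y pclip=100 nc=20 allpos=n
unit2=km unit1=s label1=Time label2=Offset
'''

commands = {}
for plot in ('wiggle', 'contour', 'grey'):
    # program name followed by the shared plotting parameters
    commands[plot] = plot + plotpar

# Splitting on whitespace shows the program name and its parameters
print(commands['grey'].split())
```

Because plotpar starts with a newline, the program name and the first parameter stay separated by whitespace after concatenation.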

Try running scons -Q wiggle.view. You should see something like the following 

output in your terminal: 

bash$ scons -Q wiggle.view 

< Txx.HH /path/to/RSF/bin/sfwindow n2=10 n1=200 f1=200 > windowed.rsf 

< windowed.rsf /path/to/RSF/bin/sfwiggle transp=y poly=y yreverse=y 

pclip=100 nc=200 > Fig/wiggle.vpl 

/path/to/RSF/bin/sfpen Fig/wiggle.vpl 

and a figure similar to Figure 2 appearing on your screen. The -Q switch tells SCons 

to run in a quiet mode, suppressing verbose comments. We will use it from now on 

to save space. You can dismiss the figure by using the “q” key on the keyboard or by 

hitting the “quit” button. 

Run scons -Q view, and you should see simply 

bash$ scons -Q view 

/path/to/RSF/bin/sfpen Fig/wiggle.vpl 

Since the wiggle.vpl figure is up to date, SCons does not rebuild it. After quitting 

the figure, SCons will resume processing with 

< windowed.rsf /path/to/RSF/bin/sfcontour transp=y poly=y yreverse=y 

pclip=100 nc=200 > Fig/contour.vpl 

/path/to/RSF/bin/sfpen Fig/contour.vpl


and a figure similar to Figure 3 appearing on your screen. Quitting the figure produces

< windowed.rsf /path/to/RSF/bin/sfgrey transp=y poly=y yreverse=y 

pclip=100 nc=200 > Fig/grey.vpl 

/path/to/RSF/bin/sfpen Fig/grey.vpl 

and Figure 4. 

Figure 2: To see this figure on your screen, run scons wiggle.view 

rsf/rsftour wiggle 

Resampling 

The next example demonstrates simple signal processing using the Fast Fourier Transform. We will first subsample the original data and then recover the data using Fourier interpolation.


Figure 3: To see this figure on your screen, run scons contour.view 

rsf/rsftour contour


Figure 4: To see this figure on your screen, run scons grey.view rsf/rsftour grey


Subsampling is accomplished with sfwindow. 

# decimate time axis by two
Flow('subsampled','windowed','window j1=2')

Running scons -Q subsampled.rsf produces 

< windowed.rsf /path/to/RSF/bin/sfwindow j1=2 > subsampled.rsf 

We can verify that the size of the first axis has decreased by running 

sfin windowed.rsf subsampled.rsf. 

Try also sfwiggle < subsampled.rsf | sfpen to quickly inspect the subsampled 

data on the screen. 

To interpolate the data back to the original sampling, the following sequence of 

steps can be applied: 

1. Fourier transform from time domain to frequency domain. 

2. Pad the frequency axis.

3. Inverse Fourier transform from frequency to time. 

All three steps are conveniently combined into one using pipes. 

# sinc interpolation in the Fourier domain
Flow('resampled','subsampled',
     'fft1 | pad n1=102 | fft1 inv=y opt=n | window max1=0.8')

Why do we pad the Fourier domain to 102? The time length of the original data 

is 201 samples. In the frequency domain, it can be represented with 101 positive 

frequencies plus the zero frequency, which amounts to 102. Note that the output of 

sffft1 does not contain negative frequencies. 
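The three steps (transform, pad, inverse transform) can be tried on a toy signal in plain Python. This sketch does not use sffft1 or sfpad: it implements a slow discrete Fourier transform that keeps only non-negative frequencies, and it assumes an even output length and a signal with no energy at the Nyquist frequency of the subsampled data. The function names are hypothetical.

```python
import cmath
import math


def rdft(x):
    """Non-negative-frequency half of the DFT of a real sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n // 2 + 1)]


def irdft(half, n):
    """Inverse of rdft for an even output length n."""
    out = []
    for t in range(n):
        acc = half[0].real
        for k in range(1, n // 2):
            # positive and negative frequencies contribute equally for real data
            acc += 2.0 * (half[k] * cmath.exp(2j * math.pi * k * t / n)).real
        acc += (half[n // 2] * cmath.exp(1j * math.pi * t)).real
        out.append(acc / n)
    return out


def fourier_upsample(y, factor=2):
    """Interpolate by zero-padding the frequency axis."""
    n = len(y) * factor
    half = rdft(y)
    half += [0j] * (n // 2 + 1 - len(half))   # pad with zero frequencies
    return [factor * v for v in irdft(half, n)]


n = 16
original = [math.cos(2 * math.pi * 2 * t / n) for t in range(n)]
subsampled = original[::2]                    # decimate by two
resampled = fourier_upsample(subsampled)
max_err = max(abs(a - b) for a, b in zip(original, resampled))
print(max_err)
```

For this band-limited test signal the reconstruction error is at the level of floating-point round-off. A signal with frequencies above the new Nyquist limit would be aliased by the decimation and could not be recovered this way, which is what the exercise with a factor of 4 explores.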

Finally, we display the result. The reconstructed data is shown in Figure 5. 

Comparing this result with Figure 2, we can verify a fairly accurate reconstruction. 

As an exercise, try subsampling the data by a factor of 4 and see if you can still 

reconstruct the original data with the Fourier method.


Figure 5: To see this figure on your screen, run scons resampled.view 

rsf/rsftour resampled


Normal Moveout 

The next example applies a simple constant-velocity NMO correction to the windowed 

data and pipes the result to a wiggle plotting command: 

Result('nmo','windowed',
       '''
       nmostretch v0=2.05 half=n |
       wiggle pclip=100 max1=0.6 poly=y
       ''')

Running scons -Q nmo.view produces 

< windowed.rsf /path/to/RSF/bin/sfnmostretch v0=2.05 half=n | 

/path/to/RSF/bin/sfwiggle pclip=100 max1=0.6 poly=y > Fig/nmo.vpl 

/path/to/RSF/bin/sfpen Fig/nmo.vpl 

and Figure 6. Note that SCons does not recreate the windowed.rsf file if that file is 

up to date. You can experiment with the NMO velocity (2.05 km/s) or with plotting 

parameters to get different results. As Dellinger and Tálas (1992) point out, the 

NMO velocity of 2.05 km/s “appears to split the difference between two distinctly 

non-hyperbolic shear waves”. 
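For reference, constant-velocity NMO is based on the standard hyperbolic moveout equation $t(x) = \sqrt{t_0^2 + x^2/v^2}$, where $t_0$ is the zero-offset time, $x$ the offset, and $v$ the NMO velocity. The sketch below only evaluates this formula; it is not the sfnmostretch implementation, nmo_time is a hypothetical helper, and treating x as the full source-receiver offset is an assumption suggested by half=n.

```python
import math


def nmo_time(t0, offset, velocity):
    """Traveltime at the given offset for a reflection with zero-offset time t0."""
    return math.sqrt(t0 * t0 + (offset / velocity) ** 2)


# Moveout at 0.5 s zero-offset time for offsets up to 1 km and v = 2.05 km/s
for x in (0.0, 0.5, 1.0):
    print(x, round(nmo_time(0.5, x, 2.05), 4))
```

NMO correction moves the sample recorded at time t(x) back to time t0, so events that follow this hyperbola appear flat after correction.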

Advanced plotting 

Sometimes, we need to combine different plots either by overlaying them on top of 

each other or by putting them side by side. Here is an example of accomplishing it 

with RSF and SCons. 

Start by creating common plotting arguments and plotting the data in greyscale.

plotpar = plotpar+' min1=.4 max1=.8 max2=1. min2=.05 poly=n'

Plot('grey','windowed',
     'grey wheretitle=t wherexlabel=b' + plotpar)

Next, plot the wiggle traces twice: the first time, using thick black lines (plotcol=0 plotfat=10), and the second time, using thinner white lines (plotcol=7 plotfat=5).

Plot('wiggle1','windowed',
     'wiggle plotcol=0 plotfat=10' + plotpar)
Plot('wiggle2','windowed',
     'wiggle plotcol=7 plotfat=5' + plotpar)


Figure 6: To see this figure on your screen, run scons nmo.view rsf/rsftour nmo


The plots are combined by overlaying or by putting them side by side. 

Result('overplot','grey wiggle1 wiggle2','Overlay')
Result('sidebyside','grey wiggle2','SideBySideIso')

The resultant plots are shown in Figures 7 and 8. 

Figure 7: To see this figure on your screen, run scons overplot.view

rsf/rsftour overplot

Figure 8: To see this figure on your screen, run scons sidebyside.view

rsf/rsftour sidebyside

CONCLUSIONS

This tour is not designed as a comprehensive manual. It simply gives a glimpse into working in a reproducible research environment with RSF and SCons. The reader is encouraged to experiment with the SConstruct file attached to this tour and included in the Appendix. For other documentation on RSF, please see

• Introduction to RSF

• Installation instructions

• Self-documentation reference for RSF programs

• A guide to RSF programs

• A guide to RSF file format

• A guide to RSF programming interface

• A guide to programming with RSF

• A guide to SCons interface for reproducible computations

ACKNOWLEDGMENTS 

Thanks to Joe Dellinger and Sándor Tálas for creating “SEP tour” and to James 

Rickett for updating it. Several generations of SEP students contributed to SEPlib. 

We try to preserve all their good ideas when refactoring SEPlib into RSF. 

The test dataset used in this paper is courtesy of Beltram Nolte and L. Neil Frazer. 

REFERENCES 

Claerbout, J. F., 1991, Introduction to Seplib and SEP utility software, in SEP-70: 

Stanford Exploration Project, 413–436. 

Dellinger, J., and S. Tálas, 1992, A tour of SEPlib for new users, in SEP-73: Stanford 

Exploration Project, 461–502.


SCONSTRUCT FILE 

Here is a complete listing of the SConstruct file used in this example. 

#########################################################
# Setting up
#########################################################

from rsf.proj import *

#########################################################
# Obtaining the test data
#########################################################

Fetch('Txx.HH','septour')

#########################################################
# Displaying the data
#########################################################

Result('wiggle0','Txx.HH','wiggle')

#########################################################
# Windowing and plotting
#########################################################

Flow('windowed','Txx.HH','window n2=10 min1=0.4 max1=0.8')

plotpar = '''
transp=y poly=y yreverse=y pclip=100 nc=20 allpos=n
unit2=km unit1=s label1=Time label2=Offset
'''

for plot in ('wiggle','contour','grey'):
    Result(plot,'windowed',plot + plotpar)

#########################################################
# Resampling
#########################################################

# decimate time axis by two
Flow('subsampled','windowed','window j1=2')

# sinc interpolation in the Fourier domain
Flow('resampled','subsampled',
     'fft1 | pad n1=102 | fft1 inv=y opt=n | window max1=0.8')

Result('resampled','wiggle title=Resampled' + plotpar)

#########################################################
# Velocity analysis and NMO
#########################################################

Result('nmo','windowed',
       '''
       nmostretch v0=2.05 half=n |
       wiggle pclip=100 max1=0.6 poly=y
       ''')

#########################################################
# Advanced plotting
#########################################################

plotpar = plotpar+' min1=.4 max1=.8 max2=1. min2=.05 poly=n'

Plot('grey','windowed',
     'grey wheretitle=t wherexlabel=b' + plotpar)
Plot('wiggle1','windowed',
     'wiggle plotcol=0 plotfat=10' + plotpar)
Plot('wiggle2','windowed',
     'wiggle plotcol=7 plotfat=5' + plotpar)

Result('overplot','grey wiggle1 wiggle2','Overlay')
Result('sidebyside','grey wiggle2','SideBySideIso')

#########################################################
# Wrapping up
#########################################################

End()



Guide to RSF API 


ABSTRACT 

This guide explains the RSF programming interface. 

INTRODUCTION 

To work with RSF files in your own programs, you may need to use an appropriate 

programming interface. We will demonstrate the interface in different languages using 

a simple example. The example is a clipping program. It reads and writes RSF files 

and accesses parameters both from the input file and the command line. The input 

is processed trace by trace. This is not necessarily the most efficient approach (see footnote 2) but it suffices for a simple demonstration.

2 Compare with the library clip program.

C INTERFACE

The C clip function is listed below.

/* Clip the data. */

#include <rsf.h>

int main(int argc, char* argv[])
{
    int n1, n2, i1, i2;
    float clip, *trace;
    sf_file in, out; /* Input and output files */

    /* Initialize RSF */
    sf_init(argc,argv);
    /* standard input */
    in = sf_input("in");
    /* standard output */
    out = sf_output("out");

    /* check that the input is float */
    if (SF_FLOAT != sf_gettype(in))
        sf_error("Need float input");

    /* n1 is the fastest dimension (trace length) */
    if (!sf_histint(in,"n1",&n1))
        sf_error("No n1= in input");
    /* leftsize gets n2*n3*n4*... (the number of traces) */
    n2 = sf_leftsize(in,1);

    /* parameter from the command line (i.e. clip=1.5 ) */
    if (!sf_getfloat("clip",&clip)) sf_error("Need clip=");

    /* allocate floating point array */
    trace = sf_floatalloc(n1);

    /* loop over traces */
    for (i2=0; i2 < n2; i2++) {

        /* read a trace */
        sf_floatread(trace,n1,in);

        /* loop over samples */
        for (i1=0; i1 < n1; i1++) {
            if (trace[i1] > clip) trace[i1]= clip;
            else if (trace[i1] < -clip) trace[i1]=-clip;
        }

        /* write a trace */
        sf_floatwrite(trace,n1,out);
    }

    exit(0);
}
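Stripped of input and output, the computation inside the two loops is small. The following plain-Python sketch shows only that clipping logic, with traces held as lists; it does not use the RSF library, and clip_trace is a hypothetical name.

```python
def clip_trace(trace, clip):
    """Limit every sample of a trace to the range [-clip, clip]."""
    return [max(-clip, min(clip, sample)) for sample in trace]


# Two short traces, processed one at a time as in the C program
traces = [[0.5, 2.0, -3.0], [1.5, -1.5, 0.0]]
clipped = [clip_trace(trace, 1.5) for trace in traces]
print(clipped)
```

Samples already inside the range pass through unchanged, so applying the function twice gives the same result as applying it once.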

Let us examine it in detail. 

#include <rsf.h>

The include preprocessing directive is required to access the RSF interface.

sf_file in, out; /* Input and output files */

RSF data files are defined with an abstract sf_file data type. An abstract data type means that its contents are not publicly declared, and all operations on sf_file objects should be performed with library functions. This is analogous to the FILE * data type used in stdio.h and is as close as C gets to an object-oriented style of programming (Roberts, 1998).

/* Initialize RSF */
sf_init(argc,argv);

Before using any of the other functions, you must call sf_init. This function parses the command line and initializes an internally stored table of command-line parameters.

/* standard input */
in = sf_input("in");
/* standard output */
out = sf_output("out");

The input and output RSF file objects are created with the sf_input and sf_output constructor functions. Both functions take a string argument. The string may refer to a file name or a file tag. For example, if the command line contains vel=velocity.rsf, then both sf_input("velocity.rsf") and sf_input("vel") are acceptable. Two tags are special: "in" refers to the file in the standard input and "out" refers to the file in the standard output.

/* check that the input is float */
if (SF_FLOAT != sf_gettype(in))
    sf_error("Need float input");

RSF files can store data of different types (character, integer, floating point, complex). We extract the data type of the input file with the library sf_gettype function and check if it represents floating point numbers. If not, the program is aborted with an error message, using the sf_error function. It is generally a good idea to check the input for user errors and, if they cannot be corrected, to take a safe exit.

/* n1 is the fastest dimension (trace length) */
if (!sf_histint(in, "n1", &n1))
    sf_error("No n1= in input");

/* leftsize gets n2*n3*n4*... (the number of traces) */
n2 = sf_leftsize(in, 1);

Conceptually, the RSF data model is a multidimensional hypercube. By convention,
the dimensions of the cube are stored in n1=, n2=, etc. parameters. The n1 parameter
refers to the fastest axis. If the input dataset is a collection of traces, n1 refers to the
trace length. We extract it using the sf_histint function (integer parameter from
history) and abort if no value for n1 is found. We could proceed in a similar fashion,
extracting n2, n3, etc. If we are interested in the total number of traces, as in the clip
example, a shortcut is to use the sf_leftsize function. Calling sf_leftsize(in,0)
returns the total number of elements in the hypercube (the product of n1, n2, etc.),
calling sf_leftsize(in,1) returns the number of traces (the product of n2, n3,
etc.), and calling sf_leftsize(in,2) returns the product of n3, n4, etc. By calling
sf_leftsize, we avoid the need to extract additional parameters for the hypercube
dimensions that we are not interested in.
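The arithmetic behind sf_leftsize can be written out in a few lines of Python. This is a sketch of the rule stated above, not the library code; the function name leftsize and the example dimensions are assumptions chosen for illustration.

```python
import math


def leftsize(dims, axis):
    """Product of the dimensions that remain after skipping the first `axis` axes."""
    return math.prod(dims[axis:])


dims = [1000, 24, 10]      # hypothetical n1, n2, n3
print(leftsize(dims, 0))   # all elements: n1*n2*n3
print(leftsize(dims, 1))   # number of traces: n2*n3
print(leftsize(dims, 2))   # n3
```

With these dimensions the three calls give 240000, 240 and 10, so a trace loop needs only n1 and the value for axis 1.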

/* parameter from the command line (i.e. clip=1.5) */
if (!sf_getfloat("clip", &clip)) sf_error("Need clip=");

The clip parameter is read from the command line, where it can be specified, for
example, as clip=10. The parameter has the float type, therefore we read it with
the sf_getfloat function. If no clip= parameter is found among the command
line arguments, the program is aborted with an error message using the sf_error
function.

/* allocate floating point array */
trace = sf_floatalloc(n1);

Next, we allocate an array of floating-point numbers to store a trace with the library
sf_floatalloc function. Unlike the standard malloc, the RSF allocation function
checks for errors and either terminates the program or returns a valid pointer.


for (i2=0; i2 < n2; i2++) {

    /* read a trace */
    sf_floatread(trace, n1, in);

    for (i1=0; i1 < n1; i1++) {
        if      (trace[i1] >  clip) trace[i1]= clip;
        else if (trace[i1] < -clip) trace[i1]=-clip;
    }

    /* write a trace */
    sf_floatwrite(trace, n1, out);
}

The rest of the program is straightforward. We loop over all available traces, read
each trace, clip it, and write the output. The syntax of the sf_floatread and
sf_floatwrite functions is similar to the syntax of the C standard fread and fwrite
functions, except that the type of the element is specified explicitly in the function name
and that the input and output files have the RSF type sf_file.
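The clipping rule applied to each trace can be checked in isolation with a short Python sketch. It uses plain lists instead of RSF files, and the function name clip_trace and the sample values are hypothetical.

```python
def clip_trace(trace, clip):
    """Limit every sample of a trace to the interval [-clip, clip]."""
    return [max(-clip, min(clip, sample)) for sample in trace]


# Two short hypothetical traces stand in for the n2 traces of an RSF file.
traces = [[-3.0, 0.5, 2.0], [1.0, -1.0, 4.0]]
clipped = [clip_trace(trace, 1.5) for trace in traces]
print(clipped)
```

Samples inside the interval pass through unchanged, which matches the if / else if pair in the C loop.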


Compiling 

To compile the clip program, run 

cc clip.c -I$RSFROOT/include -L$RSFROOT/lib -lrsf -lm 

Change cc to the C compiler appropriate for your system and include additional 

compiler flags if necessary. The flags that RSF typically uses are in 

$RSFROOT/share/madagascar/etc/config.py. 

C++ INTERFACE

The C++ clip function is listed below.

/* Clip the data. */

#include <valarray>
#include <rsf.hh>

int main(int argc, char** argv)
{
    sf_init(argc, argv); // Initialize RSF

    iRSF par(0), in; // input parameter, file
    oRSF out;        // output file

    int n1, n2;      // trace length, number of traces
    float clip;

    in.get("n1", n1);
    n2=in.size(1);

    par.get("clip", clip); // parameter from the command line

    std::valarray<float> trace(n1);

    for (int i2=0; i2 < n2; i2++) { // loop over traces
        in >> trace; // read a trace

        for (int i1=0; i1 < n1; i1++) { // loop over samples
            if      (trace[i1] >  clip) trace[i1]= clip;
            else if (trace[i1] < -clip) trace[i1]=-clip;
        }

        out << trace; // write a trace
    }

    exit(0);
}

FORTRAN-77 INTERFACE

      else if (n1 > 1000) then
         call sf_error("n1 is too long")
      end if
      n2 = sf_leftsize(in,1)

      if (.not. sf_getfloat("clip",clip))
     &     call sf_error("Need clip=")

      do 10 i2=1, n2
         call sf_floatread(trace,n1,in)

         do 20 i1=1, n1
            if (trace(i1) > clip) then
               trace(i1)=clip
            else if (trace(i1) < -clip) then
               trace(i1)=-clip
            end if
 20      continue

         call sf_floatwrite(trace,n1,out)
 10   continue

      stop
      end


      call sf_init()

The program starts with a call to sf_init, which initializes the command-line interface.

      in = sf_input("in")
      out = sf_output("out")

The input and output files are created with calls to sf_input and sf_output. Because
of the absence of derived types in Fortran-77, we use simple integer pointers to
represent RSF files. Both sf_input and sf_output accept a character string, which
may refer to a file name or a file tag. For example, if the command line contains
vel=velocity.rsf, then both sf_input("velocity.rsf") and sf_input("vel")
are acceptable. Two tags are special: "in" refers to the file in the standard input
and "out" refers to the file in the standard output.

      if (3 .ne. sf_gettype(in))
     &     call sf_error("Need float input")

RSF files can store data of different types (character, integer, floating point, complex).
The function sf_gettype checks the type of data stored in the RSF file. We make
sure that the type corresponds to floating-point numbers. If not, the program is
aborted with an error message, using the sf_error function. It is generally a good
idea to check the input for user errors and, if they cannot be corrected, to take a safe
exit.

      if (.not. sf_histint(in,"n1",n1)) then
         call sf_error("No n1= in input")
      else if (n1 > 1000) then
         call sf_error("n1 is too long")
      end if
      n2 = sf_leftsize(in,1)

Conceptually, the RSF data model is a multidimensional hypercube. By convention,
the dimensions of the cube are stored in n1=, n2=, etc. parameters. The n1 parameter
refers to the fastest axis. If the input dataset is a collection of traces, n1 refers to
the trace length. We extract it using the sf_histint function (integer parameter
from history) and abort if no value for n1 is found. Since Fortran-77 cannot easily
handle dynamic allocation, we also need to check that n1 is not larger than the size
of the statically allocated array. We could proceed in a similar fashion, extracting n2,
n3, etc. If we are interested in the total number of traces, as in the clip example,
a shortcut is to use the sf_leftsize function. Calling sf_leftsize(in,0) returns
the total number of elements in the hypercube (the product of n1, n2, etc.), calling
sf_leftsize(in,1) returns the number of traces (the product of n2, n3, etc.), and calling
sf_leftsize(in,2) returns the product of n3, n4, etc. By calling sf_leftsize, we
avoid the need to extract additional parameters for the hypercube dimensions that
we are not interested in.

      if (.not. sf_getfloat("clip",clip))
     &     call sf_error("Need clip=")

The clip parameter is read from the command line, where it can be specified, for
example, as clip=10. The parameter has the float type, therefore we read it with
the sf_getfloat function. If no clip= parameter is found among the command
line arguments, the program is aborted with an error message using the sf_error
function.

      do 10 i2=1, n2
         call sf_floatread(trace,n1,in)

         do 20 i1=1, n1
            if (trace(i1) > clip) then
               trace(i1)=clip
            else if (trace(i1) < -clip) then
               trace(i1)=-clip
            end if
 20      continue

         call sf_floatwrite(trace,n1,out)
 10   continue

Finally, we do the actual work: loop over input traces, reading, clipping, and writing
out each trace.

Compiling 

To compile the Fortran-77 program, run 

f77 clip.f -L$RSFROOT/lib -lrsff -lrsf -lm 

Change f77 to the Fortran compiler appropriate for your system and include additional
compiler flags if necessary.



FORTRAN-90 INTERFACE 

The Fortran-90 clip function is listed below. 

program Clipit
  use rsf

  implicit none
  type (file)                      :: in, out
  integer                          :: n1, n2, i1, i2
  real                             :: clip
  real, dimension (:), allocatable :: trace

  call sf_init()             ! initialize RSF
  in = rsf_input()
  out = rsf_output()

  if (sf_float /= gettype(in)) call sf_error("Need floats")

  call from_par(in,"n1",n1)
  n2 = filesize(in,1)

  call from_par("clip",clip) ! command-line parameter

  allocate (trace (n1))

  do i2=1, n2                ! loop over traces
     call rsf_read(in,trace)

     where (trace >  clip) trace =  clip
     where (trace < -clip) trace = -clip

     call rsf_write(out,trace)
  end do
end program Clipit


  use rsf

The program starts with importing the rsf module.

  call sf_init()             ! initialize RSF

A call to sf_init is needed to initialize the command-line interface.

  in = rsf_input()
  out = rsf_output()

The standard input and output files are initialized with the rsf_input and rsf_output
functions. Both functions accept optional arguments. For example, if the command
line contains vel=velocity.rsf, then both rsf_input("velocity.rsf") and
rsf_input("vel") are acceptable.

  if (sf_float /= gettype(in)) call sf_error("Need floats")

  call from_par(in,"n1",n1)
  n2 = filesize(in,1)

A call to from_par extracts the "n1" parameter from the input file. Conceptually,
the RSF data model is a multidimensional hypercube. The n1 parameter refers to
the fastest axis. If the input dataset is a collection of traces, n1 corresponds to the
trace length. We could proceed in a similar fashion, extracting n2, n3, etc. If we are
interested in the total number of traces, as in the clip example, a shortcut is to use
the filesize function. Calling filesize(in) returns the total number of elements
in the hypercube (the product of n1, n2, etc.), calling filesize(in,1) returns the
number of traces (the product of n2, n3, etc.), and calling filesize(in,2) returns the
product of n3, n4, etc. By calling filesize, we avoid the need to extract additional
parameters for the hypercube dimensions that we are not interested in.

  call from_par("clip",clip) ! command-line parameter

The clip parameter is read from the command line, where it can be specified, for
example, as clip=10. If we knew a good default value for clip, we could specify it
with an optional argument, i.e. call from_par("clip",clip,default).
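The difference between a required parameter and one with a default can be sketched in Python. The function get_float below is a hypothetical stand-in, not part of any RSF interface; it assumes the command-line parameters are already available as a dictionary of strings.

```python
def get_float(table, key, default=None):
    """Return table[key] as a float, or the default when the key is absent."""
    if key in table:
        return float(table[key])
    if default is None:
        # No value and no default: report the missing parameter.
        raise KeyError("Need %s=" % key)
    return float(default)


print(get_float({"clip": "10"}, "clip"))   # value given on the command line
print(get_float({}, "clip", 1.5))          # falls back to the default
```

Without a default, a missing clip= aborts the lookup, which mirrors the "Need clip=" error in the C and Fortran-77 programs.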

  allocate (trace (n1))

  do i2=1, n2                ! loop over traces
     call rsf_read(in,trace)

     where (trace >  clip) trace =  clip
     where (trace < -clip) trace = -clip

     call rsf_write(out,trace)
  end do

Compiling

To compile the Fortran-90 program, run

f90 clip.f90 -I$RSFROOT/include -L$RSFROOT/lib -lrsff90 -lrsf -lm

Change f90 to the Fortran-90 compiler appropriate for your system and include additional
compiler flags if necessary.



PYTHON INTERFACE

The Python clip script is listed below.

#!/usr/bin/env python

import numpy
import m8r

par = m8r.Par()
inp = m8r.Input()
output = m8r.Output()
assert 'float' == inp.type

n1 = inp.int("n1")
n2 = inp.size(1)
assert n1

clip = par.float("clip")
assert clip

trace = numpy.zeros(n1,'f')

for i2 in range(n2): # loop over traces
    inp.read(trace)
    trace = numpy.clip(trace,-clip,clip)
    output.write(trace)


import numpy
import m8r

The script starts with importing the numpy and m8r modules.

par = m8r.Par()
inp = m8r.Input()
output = m8r.Output()
assert 'float' == inp.type

Next, we initialize the command-line interface and the standard input and output
files. We also make sure that the input file type is floating point.

n1 = inp.int("n1")
n2 = inp.size(1)
assert n1

We extract the "n1" parameter from the input file. Conceptually, the RSF data
model is a multidimensional hypercube. The n1 parameter refers to the fastest axis.
If the input dataset is a collection of traces, n1 corresponds to the trace length. We
could proceed in a similar fashion, extracting n2, n3, etc. If we are interested in the
total number of traces, as in the clip example, a shortcut is to use the size method
of the Input class. Calling size(0) returns the total number of elements in the
hypercube (the product of n1, n2, etc.), calling size(1) returns the number of traces
(the product of n2, n3, etc.), and calling size(2) returns the product of n3, n4, etc.

clip = par.float("clip")
assert clip

The clip parameter is read from the command line, where it can be specified, for
example, as clip=10.

for i2 in range(n2): # loop over traces
    inp.read(trace)
    trace = numpy.clip(trace,-clip,clip)
    output.write(trace)

Finally, we do the actual work: loop over input traces, reading, clipping, and writing
out each trace.


Compiling 

The Python script does not require compilation. Simply make sure to set PYTHONPATH
and LD_LIBRARY_PATH according to
$RSFROOT/etc/madagascar/env.sh or $RSFROOT/etc/madagascar/env.csh.

MATLAB INTERFACE 

The MATLAB clip function is listed below. 

function clip(in,out,clip)
%CLIP Clip the data

dims = rsf_dim(in);
n1 = dims(1);           % trace length
n2 = prod(dims(2:end)); % number of traces
trace = 1:n1;           % allocate trace
rsf_create(out,in)      % create an output file

for i2 = 1:n2           % loop over traces
    rsf_read(trace,in,'same');
    trace(trace >  clip) =  clip;
    trace(trace < -clip) = -clip;
    rsf_write(trace,out,'same');
end


dims = rsf_dim(in);

We start by figuring out the input file dimensions.

n1 = dims(1);           % trace length
n2 = prod(dims(2:end)); % number of traces

The first dimension is the trace length; the product of all other dimensions corresponds
to the number of traces.

trace = 1:n1;           % allocate trace
rsf_create(out,in)      % create an output file

Next, we allocate the trace array and create an output file.

for i2 = 1:n2           % loop over traces
    rsf_read(trace,in,'same');
    trace(trace >  clip) =  clip;
    trace(trace < -clip) = -clip;
    rsf_write(trace,out,'same');
end

Compiling

The MATLAB script does not require compilation. Simply make sure that $RSFROOT/lib
is in MATLABPATH and LD_LIBRARY_PATH.

INSTALLATION 

To install the interface to a particular language, use the API= parameter in the RSF
configuration. For example, to install C++ and Fortran-90 API bindings in addition
to the basic package, run

scons API=c++,f90 config 

Only the C interface is configured by default. The configuration parameters are stored 

in 


REFERENCES 

Roberts, E. S., 1998, Programming abstractions in C: Addison-Wesley.



Guide to programming using RSF 

Paul Sava 1 

ABSTRACT 

This guide demonstrates a simple time-domain finite-differences modeling code 

in RSF. 

INTRODUCTION 

This section presents time-domain finite-difference modeling 2 written with the RSF 

library. The program is demonstrated with the C, C++ and Fortran 90 interfaces. 

The acoustic wave equation

$$\Delta U - \frac{1}{v^2} \frac{\partial^2 U}{\partial t^2} = f(t) \tag{1}$$

can be written as

$$\left[\Delta U - f(t)\right] v^2 = \frac{\partial^2 U}{\partial t^2} . \tag{2}$$

$\Delta$ is the Laplacian symbol, $f(t)$ is the source wavelet, $v$ is the velocity, and $U$ is a
scalar wavefield.

A discrete time step involves the following computations:

$$U_{i+1} = \left[\Delta U - f(t)\right] v^2 \Delta t^2 + 2 U_i - U_{i-1} , \tag{3}$$

where $U_{i-1}$, $U_i$ and $U_{i+1}$ represent the propagating wavefield at various time steps.
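The time-step formula can be tried out on a small grid before reading the full programs. The Python sketch below makes several simplifying assumptions that are not in the original: it works in one dimension, uses a second-order Laplacian, keeps the two edge samples at zero, and uses the hypothetical function name time_step.

```python
def time_step(um, uo, vv, src, dt, dx):
    """Advance the wavefield one step: up = (lap(uo) - src) * v^2 * dt^2 + 2*uo - um."""
    n = len(uo)
    up = [0.0] * n                      # edge samples stay at zero
    for ix in range(1, n - 1):
        # second-order 1-D Laplacian of the current wavefield
        lap = (uo[ix - 1] - 2.0 * uo[ix] + uo[ix + 1]) / (dx * dx)
        up[ix] = (lap - src[ix]) * vv[ix] ** 2 * dt * dt + 2.0 * uo[ix] - um[ix]
    return up


n = 5
um = [0.0] * n                          # wavefield at the previous step
uo = [0.0] * n                          # wavefield at the current step
vv = [2.0] * n                          # constant velocity (hypothetical value)
src = [0.0, 0.0, 1.0, 0.0, 0.0]         # source injected at the middle sample
up = time_step(um, uo, vv, src, dt=0.5, dx=1.0)
print(up)
```

Starting from a zero wavefield, the first step is driven only by the source term, so the only nonzero sample of up is the one where the source is injected.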

1 e-mail: paul.sava@beg.utexas.edu 

2 “Hello world” of seismic imaging. 



C PROGRAM 

/* time-domain acoustic FD modeling */
#include <rsf.h>
int main(int argc, char* argv[])
{
    /* Laplacian coefficients */
    float c0=-30./12.,c1=+16./12.,c2=- 1./12.;

    bool verb;           /* verbose flag */
    sf_file Fw=NULL,Fv=NULL,Fr=NULL,Fo=NULL; /* I/O files */
    sf_axis at,az,ax;    /* cube axes */
    int it,iz,ix;        /* index variables */
    int nt,nz,nx;
    float dt,dz,dx,idx,idz,dt2;

    float  *ww,**vv,**rr;      /* I/O arrays */
    float **um,**uo,**up,**ud; /* tmp arrays */

    sf_init(argc,argv);
    if(! sf_getbool("verb",&verb)) verb=0; /* verbose flag */

    /* setup I/O files */
    Fw = sf_input ("in" );
    Fo = sf_output("out");
    Fv = sf_input ("vel");
    Fr = sf_input ("ref");

    /* Read/Write axes */
    at = sf_iaxa(Fw,1); nt = sf_n(at); dt = sf_d(at);
    az = sf_iaxa(Fv,1); nz = sf_n(az); dz = sf_d(az);
    ax = sf_iaxa(Fv,2); nx = sf_n(ax); dx = sf_d(ax);

    sf_oaxa(Fo,az,1);
    sf_oaxa(Fo,ax,2);
    sf_oaxa(Fo,at,3);

    dt2 =    dt*dt;
    idz = 1/(dz*dz);
    idx = 1/(dx*dx);

    /* read wavelet, velocity & reflectivity */
    ww=sf_floatalloc(nt);     sf_floatread(ww   ,nt   ,Fw);
    vv=sf_floatalloc2(nz,nx); sf_floatread(vv[0],nz*nx,Fv);
    rr=sf_floatalloc2(nz,nx); sf_floatread(rr[0],nz*nx,Fr);

    /* allocate temporary arrays */
    um=sf_floatalloc2(nz,nx);
    uo=sf_floatalloc2(nz,nx);
    up=sf_floatalloc2(nz,nx);
    ud=sf_floatalloc2(nz,nx);

    for (ix=0; ix

        /* time step */
        for (iz=0; iz

• Compute Laplacian: ∆U.

        for (iz=2; iz


C++ PROGRAM 

// time-domain acoustic FD modeling
#include
#include
#include <rsf.hh>
#include
#include
using namespace std;

int main(int argc, char* argv[])
{
    // Laplacian coefficients
    float c0=-30./12.,c1=+16./12.,c2=- 1./12.;

    sf_init(argc,argv); // init RSF
    bool verb;          // verbose flag
    if(! sf_getbool("verb",&verb)) verb=0;

    // setup I/O files
    CUB Fw("in", "i"); Fw.headin(); //Fw.report();
    CUB Fv("vel","i"); Fv.headin(); //Fv.report();
    CUB Fr("ref","i"); Fr.headin(); //Fr.report();
    CUB Fo("out","o"); Fo.setup(3,Fv.esize());

    // Read/Write axes
    sf_axis at = Fw.getax(0); int nt = sf_n(at); float dt = sf_d(at);
    sf_axis az = Fv.getax(0); int nz = sf_n(az); float dz = sf_d(az);
    sf_axis ax = Fv.getax(1); int nx = sf_n(ax); float dx = sf_d(ax);

    Fo.putax(0,az);
    Fo.putax(1,ax);
    Fo.putax(2,at);
    Fo.headou();

    float dt2 =    dt*dt;
    float idz = 1/(dz*dz);
    float idx = 1/(dx*dx);

    // read wavelet, velocity and reflectivity
    valarray<float> ww( nt    ); ww=0; Fw >> ww;
    valarray<float> vv( nz*nx ); vv=0; Fv >> vv;
    valarray<float> rr( nz*nx ); rr=0; Fr >> rr;

    // allocate temporary arrays
    valarray<float> um(nz*nx); um=0;
    valarray<float> uo(nz*nx); uo=0;
    valarray<float> up(nz*nx); up=0;
    valarray<float> ud(nz*nx); ud=0;

    // init ValArray Index counter
    VAI k(nz,nx);

    // MAIN LOOP
    if(verb) cerr


1. Declare input, output and auxiliary file cubes (of type CUB).

    CUB Fw("in", "i"); Fw.headin(); //Fw.report();
    CUB Fv("vel","i"); Fv.headin(); //Fv.report();
    CUB Fr("ref","i"); Fr.headin(); //Fr.report();
    CUB Fo("out","o"); Fo.setup(3,Fv.esize());

2. Declare, read and write RSF cube axes: at time axis, ax space axis, az depth
axis.

    sf_axis at = Fw.getax(0); int nt = sf_n(at); float dt = sf_d(at);
    sf_axis az = Fv.getax(0); int nz = sf_n(az); float dz = sf_d(az);
    sf_axis ax = Fv.getax(1); int nx = sf_n(ax); float dx = sf_d(ax);

    Fo.putax(0,az);
    Fo.putax(1,ax);
    Fo.putax(2,at);
    Fo.headou();

3. Declare multi-dimensional valarrays for input, output and read data.

    valarray<float> ww( nt    ); ww=0; Fw >> ww;
    valarray<float> vv( nz*nx ); vv=0; Fv >> vv;
    valarray<float> rr( nz*nx ); rr=0; Fr >> rr;

4. Declare multi-dimensional valarrays for temporary storage.

    valarray<float> um(nz*nx); um=0;
    valarray<float> uo(nz*nx); uo=0;
    valarray<float> up(nz*nx); up=0;
    valarray<float> ud(nz*nx); ud=0;

5. Initialize multidimensional valarray index counter (of type VAI).

    VAI k(nz,nx);

6. Loop over time.

    for (int it=0; it


FORTRAN 90 PROGRAM 

! time-domain acoustic FD modeling
program AFDMf90
  use rsf

  implicit none

  ! Laplacian coefficients
  real :: c0=-30./12.,c1=+16./12.,c2=- 1./12.

  logical    :: verb         ! verbose flag
  type(file) :: Fw,Fv,Fr,Fo  ! I/O files
  type(axa)  :: at,az,ax     ! cube axes
  integer    :: it,iz,ix     ! index variables
  real       :: idx,idz,dt2

  real, allocatable :: vv(:,:),rr(:,:),ww(:)           ! I/O arrays
  real, allocatable :: um(:,:),uo(:,:),up(:,:),ud(:,:) ! tmp arrays

  call sf_init() ! init RSF
  call from_par("verb",verb,.false.)

  ! setup I/O files
  Fw=rsf_input ("in")
  Fv=rsf_input ("vel")
  Fr=rsf_input ("ref")
  Fo=rsf_output("out")

  ! Read/Write axes
  call iaxa(Fw,at,1); call iaxa(Fv,az,1); call iaxa(Fv,ax,2)
  call oaxa(Fo,az,1); call oaxa(Fo,ax,2); call oaxa(Fo,at,3)

  dt2 =    at%d*at%d
  idz = 1/(az%d*az%d)
  idx = 1/(ax%d*ax%d)

  ! read wavelet, velocity & reflectivity
  allocate(ww(at%n     )); ww=0.; call rsf_read(Fw,ww)
  allocate(vv(az%n,ax%n)); vv=0.; call rsf_read(Fv,vv)
  allocate(rr(az%n,ax%n)); rr=0.; call rsf_read(Fr,rr)

  ! allocate temporary arrays
  allocate(um(az%n,ax%n)); um=0.
  allocate(uo(az%n,ax%n)); uo=0.
  allocate(up(az%n,ax%n)); up=0.
  allocate(ud(az%n,ax%n)); ud=0.

  ! MAIN LOOP
  do it=1,at%n
     if(verb) write (0,*) it

     ! 4th order laplacian (arrays are 1-based, so the stencil starts at index 3)
     do iz=3,az%n-2
        do ix=3,ax%n-2
           ud(iz,ix) = &
                c0* uo(iz,  ix  ) * (idx + idz)        + &
                c1*(uo(iz  ,ix-1) + uo(iz  ,ix+1))*idx + &
                c2*(uo(iz  ,ix-2) + uo(iz  ,ix+2))*idx + &
                c1*(uo(iz-1,ix  ) + uo(iz+1,ix  ))*idz + &
                c2*(uo(iz-2,ix  ) + uo(iz+2,ix  ))*idz
        end do
     end do

     ! inject wavelet
     ud = ud - ww(it) * rr

     ! scale by velocity
     ud= ud *vv*vv

     ! time step
     up = 2*uo - um + ud * dt2
     um =   uo
     uo =   up

     ! write wavefield to output
     call rsf_write(Fo,uo)
  end do

  call exit(0)
end program AFDMf90


• Declare input, output and auxiliary file tags.

  type(file) :: Fw,Fv,Fr,Fo  ! I/O files

• Declare RSF cube axes: at time axis, ax space axis, az depth axis.

  type(axa)  :: at,az,ax     ! cube axes

• Declare multi-dimensional arrays for input, output and computations.

  real, allocatable :: vv(:,:),rr(:,:),ww(:)           ! I/O arrays
  real, allocatable :: um(:,:),uo(:,:),up(:,:),ud(:,:) ! tmp arrays

• Open files for input/output.

  Fw=rsf_input ("in")
  Fv=rsf_input ("vel")
  Fr=rsf_input ("ref")
  Fo=rsf_output("out")

• Read axes from input files; write axes to output file.

  call iaxa(Fw,at,1); call iaxa(Fv,az,1); call iaxa(Fv,ax,2)
  call oaxa(Fo,az,1); call oaxa(Fo,ax,2); call oaxa(Fo,at,3)

• Allocate arrays and read wavelet, velocity and reflectivity.

  allocate(ww(at%n     )); ww=0.; call rsf_read(Fw,ww)
  allocate(vv(az%n,ax%n)); vv=0.; call rsf_read(Fv,vv)
  allocate(rr(az%n,ax%n)); rr=0.; call rsf_read(Fr,rr)

• Allocate temporary arrays.

  allocate(um(az%n,ax%n)); um=0.
  allocate(uo(az%n,ax%n)); uo=0.
  allocate(up(az%n,ax%n)); up=0.
  allocate(ud(az%n,ax%n)); ud=0.

• Loop over time.

  do it=1,at%n

• Compute Laplacian: $\Delta U$.

     do iz=3,az%n-2
        do ix=3,ax%n-2
           ud(iz,ix) = &
                c0* uo(iz,  ix  ) * (idx + idz)        + &
                c1*(uo(iz  ,ix-1) + uo(iz  ,ix+1))*idx + &
                c2*(uo(iz  ,ix-2) + uo(iz  ,ix+2))*idx + &
                c1*(uo(iz-1,ix  ) + uo(iz+1,ix  ))*idz + &
                c2*(uo(iz-2,ix  ) + uo(iz+2,ix  ))*idz
        end do
     end do

• Inject source wavelet: $[\Delta U - f(t)]$

     ud = ud - ww(it) * rr

• Scale by velocity: $[\Delta U - f(t)] v^2$

     ud= ud *vv*vv

• Time step: $U_{i+1} = [\Delta U - f(t)] v^2 \Delta t^2 + 2 U_i - U_{i-1}$

     up = 2*uo - um + ud * dt2
     um =   uo
     uo =   up
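The Laplacian step in the list above can be checked numerically with a short Python sketch. It reuses the coefficients c0, c1 and c2 from the listings, but the function name laplacian, the nested-list storage and the test field are assumptions made for illustration. Indices are 0-based here, so the interior loop runs from 2 to n-3.

```python
C0, C1, C2 = -30.0 / 12.0, 16.0 / 12.0, -1.0 / 12.0   # 4th-order stencil weights


def laplacian(uo, dz, dx):
    """4th-order finite-difference Laplacian of uo[iz][ix]; the 2-sample border stays zero."""
    nz, nx = len(uo), len(uo[0])
    idz, idx = 1.0 / (dz * dz), 1.0 / (dx * dx)
    ud = [[0.0] * nx for _ in range(nz)]
    for iz in range(2, nz - 2):
        for ix in range(2, nx - 2):
            ud[iz][ix] = (
                C0 * uo[iz][ix] * (idx + idz)
                + C1 * (uo[iz][ix - 1] + uo[iz][ix + 1]) * idx
                + C2 * (uo[iz][ix - 2] + uo[iz][ix + 2]) * idx
                + C1 * (uo[iz - 1][ix] + uo[iz + 1][ix]) * idz
                + C2 * (uo[iz - 2][ix] + uo[iz + 2][ix]) * idz
            )
    return ud


# Test field u = z^2 + x^2 on a 7x7 grid with unit spacing; its Laplacian is 4.
field = [[float(iz * iz + ix * ix) for ix in range(7)] for iz in range(7)]
ud = laplacian(field, dz=1.0, dx=1.0)
print(ud[3][3])
```

The stencil weights sum to zero along each axis, so a constant field gives a zero Laplacian, and a quadratic field is differentiated exactly up to rounding.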


Reproducible computational experiments 

using SCons 

Sergey Fomel 1 and Gilles Hennenfent 2

ABSTRACT 

SCons (from Software Construction) is a well-known open-source program designed 

primarily for building software. In this paper, we describe our method of 

extending SCons for managing data processing flows and reproducible computational 

experiments. We demonstrate our usage of SCons with a couple of simple 

examples. 

INTRODUCTION 

This paper introduces an environment for reproducible computational experiments 

developed as part of the “Madagascar” software package. To reproduce the example
experiments in this paper, you can download Madagascar from http://www.ahay.org/.
At the moment, the main Madagascar interface is the Unix shell command
line, so you will need a Unix/POSIX system (Linux, Mac OS X, Solaris, etc.) or
Unix emulation under Windows (Cygwin, SFU, etc.).

Our focus, however, is not only on particular tools we use in our research but also 

on the general philosophy of reproducible computations. 

Reproducible research philosophy 

Peer review is the backbone of scientific progress. From the ancient alchemists, who 

worked in secret on magic solutions to insolvable problems, modern science has

come a long way to become a social enterprise, where hypotheses, theories, and experimental 

results are openly published and verified by the community. By reproducing 

and verifying previously published research, a researcher can take new steps to advance 

the progress of science. 

Traditionally, scientific disciplines are divided into theoretical and experimental
studies. Reproduction and verification of theoretical results usually requires only
imagination (apart from pencils and paper); experimental results are verified in laboratories
using equipment and materials similar to those described in the publication.

1 University of Texas at Austin, E-mail: sergey.fomel@beg.utexas.edu

2 Earth & Ocean Sciences, University of British Columbia, E-mail: ghennenfent@eos.ubc.ca

During the last century, computational studies emerged as a new scientific discipline. 

Computational experiments are carried out on a computer by applying numerical 

algorithms to digital data. How reproducible are such experiments? On one hand, 

reproducing the result of a numerical experiment is a difficult undertaking. The reader 

needs to have access to precisely the same kind of input data, software and hardware 

as the author of the publication in order to reproduce the published result. It is often 

difficult or impossible to provide detailed specifications for these components. On the 

other hand, basic computational system components such as operating systems and 

file formats are getting increasingly standardized, and new components can be shared 

in principle because they simply represent digital information transferable over the 

Internet. 

The practice of software sharing has fueled the miraculously efficient development 

of Linux, Apache, and many other open-source software projects. Its proponents 

often refer to this ideology as an analog of the scientific peer review tradition. Eric 

Raymond, a well-known open-source advocate, writes (Raymond, 2004): 

Abandoning the habit of secrecy in favor of process transparency and peer 

review was the crucial step by which alchemy became chemistry. In the 

same way, it is beginning to appear that open-source development may 

signal the long-awaited maturation of software development as a discipline. 

While software development is trying to imitate science, computational science needs 

to borrow from the open-source model in order to sustain itself as a fully scientific 

discipline. In the words of Randy LeVeque, a prominent mathematician (LeVeque, 2006),

Within the world of science, computation is now rightly seen as a third 

vertex of a triangle complementing experiment and theory. However, as 

it is now often practiced, one can make a good case that computing is 

the last refuge of the scientific scoundrel [...] Where else in science can 

one get away with publishing observations that are claimed to prove a 

theory or illustrate the success of a technique without having to give a 

careful description of the methods used, in sufficient detail that others 

can attempt to repeat the experiment? [...] Scientific and mathematical 

journals are filled with pretty pictures these days of computational experiments 

that the reader has no hope of repeating. Even brilliant and 

well intentioned computational scientists often do a poor job of presenting 

their work in a reproducible manner. The methods are often very vaguely 

defined, and even if they are carefully defined, they would normally have 

to be implemented from scratch by the reader in order to test them. 

In computer science, the concept of publishing and explaining computer programs 

goes back to the idea of literate programming promoted by Knuth (1984) and expanded

by many other researchers (Thimbleby, 2003). In his 2004 lecture on “better

programming”, Harold Thimbleby notes 2 

We want ideas, and in particular programs, that work in one place to work 

elsewhere. One form of objectivity is that published science must work 

elsewhere than just in the author’s laboratory or even just in the author’s 

imagination; this requirement is called reproducibility. 

Nearly ten years ago, the technology of reproducible research in geophysics was 

pioneered by Jon Claerbout and his students at the Stanford Exploration Project 

(SEP). SEP’s system of reproducible research requires the author of a publication to 

document creation of numerical results from the input data and software sources to 

let others test and verify the result reproducibility (Claerbout, 1992a; Schwab et al., 

2000). The discipline of reproducible research was also adopted and popularized in 

the statistics and wavelet theory community by Buckheit and Donoho (1995). It 

is referenced in several popular wavelet theory books (Hubbard, 1998; Mallat, 1999). 

Pledges for reproducible research appear nowadays in fields as diverse as bioinformatics 

(Gentleman et al., 2004), geoinformatics (Bivand, 2006), and computational wave 

propagation (LeVeque, 2006). However, the adoption of reproducible research practices

by computational scientists has been slow, in part because the available tools are

difficult to use and inadequate.

Tools for reproducible research 

The reproducible research system developed at Stanford is based on “make” (Stallman

et al., 2004), a Unix software construction utility. Originally, SEP used “cake”, a

dialect of “make” (Nichols and Cole, 1989; Claerbout and Nichols, 1990; Claerbout, 

1992b; Claerbout and Karrenbach, 1993). The system was converted to “GNU make”, 

a more standard dialect, by Schwab and Schroeder (1995). The “make” program 

keeps track of dependencies between different components of the system and the 

software construction targets, which, in the case of a reproducible research system, 

turn into figures and manuscripts. The targets and commands for their construction 

are specified by the author in “makefiles”, which serve as databases for defining source 

and target dependencies. A dependency-based system leads to rapid development, 

because when one of the sources changes, only parts that depend on this source get 

recomputed. Buckheit and Donoho (1995) based their system on MATLAB, a popular 

integrated development environment produced by MathWorks (Sigmon and Davis, 

2001). While MATLAB is an adequate tool for prototyping numerical algorithms, it 

may not be sufficient for large-scale computations typical for many applications in 

computational geophysics. 
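The timestamp rule that “make” applies to decide what gets recomputed can be sketched in a few lines of Python. This is only an illustration of the comparison, not the code of “make” itself; the function name out_of_date and the file names are hypothetical.

```python
import os
import tempfile


def out_of_date(target, sources):
    """Return True if target is missing or older than any of its sources."""
    if not os.path.exists(target):
        return True
    target_time = os.path.getmtime(target)
    return any(os.path.getmtime(s) > target_time for s in sources)


# Small demonstration with two empty files and hand-set modification times.
with tempfile.TemporaryDirectory() as workdir:
    source = os.path.join(workdir, "data.rsf")
    figure = os.path.join(workdir, "plot.vpl")
    for path in (source, figure):
        open(path, "w").close()

    os.utime(source, (100, 100))   # source last modified at t=100
    os.utime(figure, (200, 200))   # figure built later, at t=200
    fresh = out_of_date(figure, [source])   # figure is newer: no rebuild

    os.utime(source, (300, 300))   # source edited after the figure was built
    stale = out_of_date(figure, [source])   # figure is older: rebuild

print(fresh, stale)
```

Because the decision rests entirely on file modification times, it is only as reliable as the clocks that set them.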

“Make” is an extremely useful utility employed by thousands of software development 

projects. Unfortunately, it is not well designed from the user experience 

2 http://www.uclic.ucl.ac.uk/harold/


perspective. “Make” employs an obscure and limited special language (a mixture of

Unix shell commands and special-purpose commands), which often appears confusing

to inexperienced users. According to Peter van der Linden, a software expert from

Sun Microsystems (van der Linden, 1994), 

“Sendmail” and “make” are two well known programs that are pretty 

widely regarded as originally being debugged into existence. That’s why 

their command languages are so poorly thought out and difficult to learn. 

It’s not just you – everyone finds them troublesome. 

The “make” command language is also inconvenient because of its limited capabilities.

The reproducible research system developed by Schwab et al. (2000) includes not 

only custom “make” rules but also an obscure and hardly portable agglomeration of 

shell and Perl scripts that extend “make” (Fomel et al., 1997). 

Several alternative systems for dependency-checking software construction have 

been developed in recent years. One of the most promising new tools is SCons, enthusiastically 

endorsed by Dubois (2003). The SCons initial design won the Software 

Carpentry competition sponsored by Los Alamos National Laboratory in 2000 in the 

category of “a dependency management tool to replace make”. Some of the main 

advantages of SCons are: 

• SCons configuration files are Python scripts. Python is a modern programming 

language praised for its readability, elegance, simplicity, and power (Rossum, 

2000a,b). Scales and Ecke (2002) recommend Python as the first programming 

language for geophysics students. 

• SCons offers reliable, automatic, and extensible dependency analysis and creates 

a global view of all dependencies – no more “make depend”, “make clean”, 

or multiple build passes of touching and reordering targets to get all of the 

dependencies. 

• SCons has built-in support for many programming languages and systems: C, 

C++, Fortran, Java, LaTeX, and others. 

• While “make” relies on timestamps for detecting file changes (creating numerous 

problems on platforms with different system clocks), SCons uses by default a 

more reliable detection mechanism employing MD5 signatures. It can detect 

changes not only in files but also in commands used to build them. 

• SCons provides integrated support for parallel builds. 

• SCons provides configuration support analogous to the “autoconf” utility for 

testing the environment on different platforms. 

• SCons is designed from the ground up as a cross-platform tool. It is known to 

work equally well on both POSIX systems (Linux, Mac OS X, Solaris, etc.) and 

Windows.


• The stability of SCons is assured by an incremental development methodology 

utilizing comprehensive regression tests. 

• SCons is publicly released under a liberal open-source license 3 
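The signature-based change detection mentioned in the list above can be illustrated with a minimal sketch. It assumes that a target's signature combines the content of its source with the command used to build it. The function names signature and needs_rebuild are hypothetical, and this is not the actual SCons implementation.

```python
import hashlib


def signature(content, command):
    """Combine source content (bytes) and build command (str) into one MD5 digest."""
    digest = hashlib.md5()
    digest.update(content)
    digest.update(b"\0")                      # separator between the two parts
    digest.update(command.encode("utf-8"))
    return digest.hexdigest()


def needs_rebuild(stored, content, command):
    """Return True if the stored signature no longer matches the current state."""
    return stored != signature(content, command)


stored = signature(b"n1=512 n2=512", "sfgrey clip=100")

# Same content, same command: nothing to do.
print(needs_rebuild(stored, b"n1=512 n2=512", "sfgrey clip=100"))  # False
# Same content, different command: the target is out of date.
print(needs_rebuild(stored, b"n1=512 n2=512", "sfgrey clip=50"))   # True
```

Unlike a timestamp comparison, this check does not depend on system clocks, and it notices a changed command even when no file was touched.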

In this paper, we propose to adopt SCons as a new platform for reproducible 

research in scientific computing. 

Paper organization 

We first give a brief overview of the “Madagascar” software package and define the different

levels of user interactions. To demonstrate our adoption of SCons for reproducible 

research, we then describe a couple of simple examples of computational experiments 

and finally show how SCons helps us document our computational results. 

MADAGASCAR SOFTWARE PACKAGE OVERVIEW 

Figure 1: [Diagram of the “Madagascar” layers. Boxes: Documentation (PDF & HTML); Book and Report/paper (SCons + LaTeX); Processing flows (SCons + Python); Programs (Matlab, Mathematica, Python, C++, Fortran, C, SU, SEP, Delphi); Command line.]

“Madagascar” is a multi-layered software package (Fig. 1). Users can thus use it 

in different ways: 

3 As of the time of this writing, SCons is in beta version 0.96, approaching the official 1.0 release.

See http://www.scons.org/.


• command line: “Madagascar” is first of all a collection of command line 

programs. Most programs act as filters on input data and can be chained in a 

Unix pipeline, e.g. 

sfspike n1=200 n2=50 | sfnoise rep=y >noise.rsf 

Although these programs mainly focus at this point on geophysical applications, 

users can use the API (application programmer’s interface) for writing their own 

software to manipulate Regularly Sampled Format (RSF) files, the “Madagascar”

file format. The main software language of “Madagascar” is C. Interfaces to 

other languages (C++, Fortran-77, Fortran-90, Python) are also provided. 

• processing flows: “Madagascar” is also an environment for reproducible numerical 

experiments in a very broad sense. These numerical experiments (or 

“computational recipes”) can be done not only using “Madagascar” command 

line programs but also Matlab, Mathematica, Python, or other seismic packages 

(e.g. SEP, Seismic Unix). We adopted SCons for this part as we shall 

demonstrate later. 

• documentation: the uppermost layer of “Madagascar”, and maybe the most

critical for reproducible research, is documentation. “Madagascar” establishes

a direct link between the figures of a paper or a report and the codes that

were used to generate them. This layer uses SCons in combination with LaTeX

to generate PDF, HTML, and MediaWiki files easily, which makes

“Madagascar” an environment of choice for technology transfer, report, thesis,

and peer-reviewed publication writing.

EXAMPLE EXPERIMENTS 

The main SConstruct commands defined in our reproducible research environment 

are collected in Table 1. 

These commands are defined in $RSFROOT/lib/rsfproj.py where RSFROOT is the

environment variable that points to the Madagascar installation directory. The source of this

file is in python/rsfproj.py. 

Example 1 

To follow the first example, select a working project directory and copy the following 

code to a file named SConstruct 4 . 



4 The source of this file is also accessible at book/rsf/scons/easystart/SConstruct.


Fetch(data file,dir[,ftp server info])
    A rule to download the data file from a specific directory (dir) of an
    FTP server (ftp server info).

Flow(target[s],source[s],command[s][,stdin][,stdout])
    A rule to generate target[s] from source[s] using command[s].

Plot(intermediate plot[,source],plot command) or
Plot(intermediate plot,intermediate plots,combination)
    A rule to generate an intermediate plot in the working directory.

Result(plot[,source],plot command) or
Result(plot,intermediate plots,combination)
    A rule to generate a final plot in the special Fig folder of the working
    directory.

End()
    A rule to collect default targets.

Table 1: Basic methods of an rsf.proj object.

from rsf.proj import *

# Download the input data file
Fetch('lena.img','imgs')

# Create RSF header
Flow('lena.hdr','lena.img',
     'echo n1=512 n2=513 in=$SOURCE data_format=native_uchar',
     stdin=0)

# Convert to floating point and window out first trace
Flow('lena','lena.hdr','dd type=float | window f2=1')

# Display
Result('lena',
       '''
       sfgrey title="Hello, World!" transp=n color=b bias=128
       clip=100 screenratio=1
       ''')

# Wrap up
End()


This is our “hello world” example that illustrates the basic use of some of the 

commands presented in Table 1. The plan for this experiment is simply to download 

data from a public data server, to convert it to an appropriate file format and to 

generate a figure for publication. Let us take a closer look at the SConstruct

script and dissect it.


from rsf.proj import *

is a standard Python command that loads the Madagascar project management module

rsf.proj, which provides our extension to SCons.

Fetch('lena.img','imgs')

instructs SCons to connect to a public data server (the default server if no FTP server 

information is provided) and to fetch the data file lena.img from the data/imgs 

directory. Try running “scons lena.img” on the command line. The successful 

output should look like 

bash$ scons lena.img 

scons: Reading SConscript files ... 

scons: done reading SConscript files. 

scons: Building targets ... 

retrieve(["lena.img"], []) 

scons: done building targets. 

with the target file lena.img appearing in your directory. In the following examples, 

we will use -Q (quiet) option of scons to suppress the verbose output. 

Flow('lena.hdr','lena.img',
     'echo n1=512 n2=513 in=$SOURCE data_format=native_uchar',
     stdin=0)

prepares the Madagascar header file lena.hdr using the standard Unix command 

echo. 

bash$ scons -Q lena.hdr 

echo n1=512 n2=513 in=lena.img data_format=native_uchar > lena.hdr 

Since echo does not take a standard input, stdin is set to 0 in the Flow command;

otherwise, the first source is the standard input. Likewise, the first target is the

standard output unless otherwise specified. Note that lena.img is referred to as $SOURCE

in the command. This allows us to change the name of the source file without changing 

the command.
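The way a Flow rule turns into a shell command line can be imitated with a short sketch. The function flow_command below is hypothetical and much simpler than the real rsf.proj code; it only reproduces the stdin, stdout, and $SOURCE conventions described above.

```python
def flow_command(target, source, command, stdin=1, stdout=1):
    """Build the shell line that a Flow-like rule might run (illustration only)."""
    line = command.replace("$SOURCE", source)
    if stdin:
        # By default, the first source is fed to the command on standard input.
        line = "< %s %s" % (source, line)
    if stdout:
        # By default, the first target receives the standard output.
        line = "%s > %s" % (line, target)
    return line


# echo reads nothing from standard input, hence stdin=0.
print(flow_command("lena.hdr", "lena.img",
                   "echo n1=512 n2=513 in=$SOURCE data_format=native_uchar",
                   stdin=0))

# A filter pipeline uses both redirections.
print(flow_command("lena.rsf", "lena.hdr", "sfdd type=float | sfwindow f2=1"))
```

The first call yields the same line that scons -Q lena.hdr prints above. Renaming the source file only requires changing the second argument, not the command string.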


The data format of the lena.img image file is uchar (unsigned character), and the

image consists of 513 traces with 512 samples per trace. Our next step is to convert the

image representation to floating point numbers and to window out the first trace so 

that the final image is a 512 by 512 square. The two transformations are conveniently 

combined into one with the help of a Unix pipe. 

Flow('lena','lena.hdr','dd type=float | window f2=1')

bash$ scons -Q lena 

scons: *** Do not know how to make target ‘lena’. 

Stop. 

What happened? In the absence of the file suffix, the Flow command assumes that 

the target file suffix is “.rsf”. Let us try again. 

bash$ scons -Q lena.rsf

< lena.hdr /RSF/bin/sfdd type=float | /RSF/bin/sfwindow f2=1 > lena.rsf 

Notice that Madagascar modules sfdd and sfwindow get substituted for the corresponding 

short names in the SConstruct file. The file lena.rsf is in a regularly 

sampled format 5 and can be examined, for example, with sfin lena.rsf 6 . 

bash$ sfin lena.rsf 

lena.rsf: 

in="/datapath/lena.rsf@" 


n1=512 d1=1 o1=0 

n2=512 d2=1 o2=1 
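The effect of the two transformations on the data can be mimicked in plain Python on a tiny synthetic array. This is a sketch of the idea only, not of how sfdd and sfwindow are implemented; the function names and the 4-by-3 test array are made up for the illustration.

```python
def read_uchar_traces(raw, n1, n2):
    """Split n1*n2 unsigned bytes into n2 traces of n1 floating-point samples."""
    if len(raw) != n1 * n2:
        raise ValueError("expected %d bytes, got %d" % (n1 * n2, len(raw)))
    return [[float(value) for value in raw[i * n1:(i + 1) * n1]]
            for i in range(n2)]


def window_traces(traces, f2):
    """Drop the first f2 traces, as window f2=... does along the second axis."""
    return traces[f2:]


n1, n2 = 4, 3                      # tiny stand-in for n1=512 n2=513
raw = bytes(range(n1 * n2))        # synthetic samples 0, 1, ..., 11
traces = read_uchar_traces(raw, n1, n2)
kept = window_traces(traces, f2=1)

print(len(kept), len(kept[0]))     # 2 4
print(kept[0])                     # [4.0, 5.0, 6.0, 7.0]
```

Dropping one trace reduces the second dimension by one, which is consistent with n2=512 and o2=1 in the header shown above.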


In the last step, we will create a plot file for displaying the image on the screen and 

for including it in the publication. 

Result('lena',
       '''
       sfgrey title="Hello, World!" transp=n color=b bias=128
       clip=100 screenratio=1
       ''')

Notice that we broke the long command string into multiple lines by using Python’s 

triple quote syntax. All the extra white space will be ignored when the multiple line 

string gets translated into the command line. The Result command has special targets 

associated with it. Try, for example, “scons lena.view” to observe the figure 

Fig/lena.vpl generated in a specially created Fig directory and displayed on the

screen. The output should look like Figure 2.

Figure 2: The output of the first numerical experiment.
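One plausible way to turn a triple-quoted command into a single command line is to split it on white space and join the pieces with single spaces. The sketch below shows this in plain Python; whether rsf.proj does exactly this is an assumption.

```python
command = '''
sfgrey title="Hello, World!" transp=n color=b bias=128
clip=100 screenratio=1
'''

# split() with no argument splits on any run of spaces, tabs, or newlines.
one_line = ' '.join(command.split())
print(one_line)
```

Note that this simple approach would also collapse repeated spaces inside a quoted title, which is harmless here because the title contains only single spaces.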

The reproducible script ends with 

End()

Ready to experiment? Try some of the following: 

1. Run scons -c. The -c (clean) option tells SCons to remove all default targets 

(the Fig/lena.vpl image file in our case) and also all intermediate targets that 

it generated. 

bash$ scons -c -Q 

Removed lena.img 

Removed lena.hdr 

Removed lena.rsf 

Removed /datapath/lena.rsf@ 

Removed Fig/lena.vpl 

Run scons again, and the default target will be regenerated. 

bash$ scons -Q 

retrieve(["lena.img"], []) 

echo n1=512 n2=513 in=lena.img data_format=native_uchar > lena.hdr 

< lena.hdr /RSF/bin/sfdd type=float | /RSF/bin/sfwindow f2=1 > lena.rsf 

< lena.rsf /RSF/bin/sfgrey title="Hello, World!" transp=n color=b \ 

bias=128 clip=100 screenratio=1 > Fig/lena.vpl 

2. Edit your SConstruct file and change some of the plotting parameters. For 

example, change the value of clip from clip=100 to clip=50. Run scons 

again and observe that only the last part of the processing flow (precisely, the 

part affected by the parameter change) is being run: 

5 See http://rsf.sourceforge.net/wiki/index.php/Format 

6 See http://rsf.sourceforge.net/wiki/index.php/Programs#sfin.


bash$ scons -Q view 

< lena.rsf /RSF/bin/sfgrey title="Hello, World!" transp=n color=b \ 

bias=128 clip=50 screenratio=1 > Fig/lena.vpl 

/RSF/bin/xtpen Fig/lena.vpl 

SCons is smart enough to recognize that your editing did not affect any of the 

previous results in the data flow chain! Keeping track of dependencies is the 

main feature that separates data processing and computational experimenting 

with SCons from using linear shell scripts. For computationally demanding 

data processing, this feature can save you a lot of time and can make your 

experiments more interactive and enjoyable. 

3. A special parameter to SCons (defined in rsf.proj.py) can time the execution 

of each step in the processing flow. Try running scons TIMER=y. 

4. The rsf.proj module has direct access to the database that stores parameters 

of all Madagascar modules. Try running scons CHECKPAR=y to see parameter 

checking enforced before computations 7 . 

The summary of our SCons commands is given in Table 2. 
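The dependency tracking observed in the second experiment above can be imitated with a toy builder. In this sketch, the signature of each target is derived from the signature of its source and from its command, so that a change propagates only downstream. The class MiniBuild is hypothetical and far simpler than SCons, which also examines the content of the files themselves.

```python
import hashlib


def digest(text):
    """MD5 digest of a string, used here as a signature."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


class MiniBuild:
    """Toy builder: steps is an ordered list of (target, source, command)."""

    def __init__(self, steps):
        self.steps = steps
        self.stored = {}              # target -> signature at the last build

    def run(self):
        """Pretend to rerun the steps whose signature changed; return their targets."""
        rebuilt = []
        current = {}
        for target, source, command in self.steps:
            # Signature of the source if it is itself a target, else its name.
            upstream = current.get(source, source or "")
            sig = digest(upstream + "|" + command)
            current[target] = sig
            if self.stored.get(target) != sig:
                self.stored[target] = sig
                rebuilt.append(target)
        return rebuilt


build = MiniBuild([
    ("lena.img", None, "retrieve"),
    ("lena.hdr", "lena.img", "echo n1=512 n2=513 in=lena.img"),
    ("lena.rsf", "lena.hdr", "sfdd type=float | sfwindow f2=1"),
    ("Fig/lena.vpl", "lena.rsf", "sfgrey clip=100"),
])

first = build.run()                   # everything is built the first time
build.steps[-1] = ("Fig/lena.vpl", "lena.rsf", "sfgrey clip=50")
second = build.run()                  # only the plot is affected
third = build.run()                   # nothing changed: nothing to do

print(first)
print(second)
print(third)
```

Changing the command of an intermediate step instead would mark that step and everything after it for recomputation, while the earlier steps stay untouched.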

Example 2 

The plan for this experiment is to add random noise to the test “Lena” image and then 

to attempt removing it by low-pass filtering and by hard thresholding of coefficients 

in the Fourier domain. The resulting images are shown in Figures 3 and 4.

Since the SConstruct file is a Python script, we can also use all the flexibility and

power of the Python language in our Madagascar reproducible scripts. A demo script 

is available in the rsf/scons/rsfpy subdirectory of the Madagascar book directory. 

Rather than commenting it line-by-line, we select some parts of interest. 

In the SConstruct script, we can declare Python variables 

bias = 128

and use them later, for example, to define our customized plot command as a Python 

function 

def grey(title,transp='n',bias=bias):
    return '''
    sfgrey title="%s" transp=%s bias=%g clip=100
    screenht=10 screenwd=10 crowd2=0.85 crowd1=0.8
    label1= label2=
    ''' % (title,transp,bias)

7 This feature is new and experimental and may not work properly yet.


scons <file>
    Generate <file> (usually requires the .rsf suffix for Flow targets and the .vpl
    suffix for Plot targets).

scons
    Generate default targets (usually figures specified in Result).

scons view or scons <result>.view
    Generate Result figures and display them on the screen.

scons print or scons <result>.print
    Generate Result figures and print them.

scons lock or scons <result>.lock
    Generate Result figures and install them in a separate location.

scons test or scons <result>.test
    Generate Result figures and compare them with the corresponding “locked”
    figures stored in a separate location (regression testing).

scons <result>.flip
    Generate the <result> figure and compare it with the corresponding
    “locked” figure stored in a separate location by flipping between the two
    figures on the screen.

scons TIMER=y ...
    Time the execution of each step in the processing flow (using the Unix time
    utility).

scons CHECKPAR=y ...
    Check the names and values of all parameters supplied to Madagascar modules
    in the processing flows before executing anything (guards against incorrect
    input). This option is new and experimental.

Table 2: SCons commands and options defined in rsf.proj.


Figure 3: Top left: original image. Top right: random noise added. Bottom left:

original image spectrum in the Fourier (F-X) domain. Bottom right: noisy image

spectrum in the Fourier (F-X) domain.

Figure 4: Left: denoising by low-pass filtering. Right: denoising by hard thresholding

in the Fourier domain.


This Python function, named grey(), can then be called in Plot or Result commands, 

e.g. 

Plot('lplena',grey('Noisy Lena LP filtered'))
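One detail of the grey() definition deserves a short illustration: the default value bias=bias is evaluated once, when the function is defined. The sketch below repeats the function from the script and shows, in plain Python without SCons, that reassigning the variable later does not change the default. The variable names before, after, and explicit are only for the demonstration.

```python
bias = 128


def grey(title, transp='n', bias=bias):
    """Return an sfgrey command string with the given title and options."""
    return '''
    sfgrey title="%s" transp=%s bias=%g clip=100
    screenht=10 screenwd=10 crowd2=0.85 crowd1=0.8
    label1= label2=
    ''' % (title, transp, bias)


before = grey('Lena')
bias = 64                      # reassigning the variable after the definition
after = grey('Lena')           # still uses the default captured earlier (128)
explicit = grey('Lena in FX domain', 'y', 100)   # overrides both defaults

print(before == after)         # True
print('bias=128' in after)     # True
print('transp=y' in explicit and 'bias=100' in explicit)   # True
```

To plot with a different bias, pass it explicitly, as the script does for the spectra.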

We can define a Python dictionary, e.g. 

titles = {'lena': 'Lena',
          'nlena': 'Noisy Lena'}

and loop over its entries, e.g. 

for name in titles.keys():
    Plot(name,grey(titles[name]))
    cftitle = titles[name]+' in FX domain'
    Flow('fx'+name,name,'sfspectra')
    Plot('fx'+name,grey(cftitle,'y',100))

Note that the title of the plots is obtained by concatenating Python strings. 
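To see which rules the loop registers without running SCons, one can replace Plot and Flow by stand-ins that merely record their arguments. The stubs below, the rules list, and the shortened grey() are hypothetical and serve only this illustration; the real commands are provided by rsf.proj.

```python
rules = []


def Plot(target, command):
    """Stand-in for the rsf.proj Plot command: record the rule only."""
    rules.append(('Plot', target, command))


def Flow(target, source, command):
    """Stand-in for the rsf.proj Flow command: record the rule only."""
    rules.append(('Flow', target, source, command))


def grey(title, transp='n', bias=128):
    """Shortened version of the plot command from the script."""
    return 'sfgrey title="%s" transp=%s bias=%g clip=100' % (title, transp, bias)


titles = {'lena': 'Lena',
          'nlena': 'Noisy Lena'}

for name in titles.keys():
    Plot(name, grey(titles[name]))
    cftitle = titles[name] + ' in FX domain'
    Flow('fx' + name, name, 'sfspectra')
    Plot('fx' + name, grey(cftitle, 'y', 100))

for rule in rules:
    print(rule)
```

Each dictionary entry produces three rules: a plot of the image, a flow that computes its spectrum, and a plot of that spectrum. Adding an image to the experiment then amounts to adding one entry to the dictionary.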

Python strings can also be used to define sequences of commands used in several 

Flows, e.g. 

# 2-D FFT
fft2 = 'sffft1 sym=y | sffft3 sym=y'
Flow('fnlena','nlena',fft2)

Finally, in our Madagascar reproducible script, we may want the option to pass 

command line arguments when running SCons or use default values otherwise, e.g. 

# denoising using thresholding in the Fourier domain
fthr = float(ARGUMENTS.get('fthr', 70))
Flow('fthrlena','fnlena','sfthr thr=%f mode="hard"' % fthr)

When running scons alone, the default value of fthr (i.e. 70) is used, whereas running

scons fthr=68 sets fthr to the value specified on the command line.
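The default-or-override logic can be tried outside SCons by letting an ordinary dictionary play the role of ARGUMENTS. In the sketch below, the function threshold_command is hypothetical; the values are given as strings because that is how they arrive from the command line.

```python
def threshold_command(arguments):
    """Build the thresholding command from a dictionary of key=value arguments."""
    # Command-line values are strings, so convert; fall back to 70 if absent.
    fthr = float(arguments.get('fthr', 70))
    return 'sfthr thr=%f mode="hard"' % fthr


print(threshold_command({}))                # sfthr thr=70.000000 mode="hard"
print(threshold_command({'fthr': '68'}))    # sfthr thr=68.000000 mode="hard"
```

A value that cannot be converted to a number, such as fthr=abc, makes float() raise a ValueError, so the mistake surfaces while the script is read rather than during the computation.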

This is by no means an exhaustive list of options but, hopefully, it gives you a

flavor of the powerful tool you have in your hands. Enjoy!

CREATING REPRODUCIBLE DOCUMENTATION 

You are done with computational experiments and want to communicate them in a 

paper. SCons helps us create high-quality papers, where computational results (figures)

are integrated with papers written in LaTeX. The corresponding SCons extension

is defined in $RSFROOT/lib/rsftex.py where RSFROOT is the environment variable

that points to the Madagascar installation directory. The source of this file is in python/rsftex.py.

We summarize the basic methods and commands in Tables 3 and 4.


Paper(<paper name>[,lclass][,use][,include][,options])
    A rule to compile the <paper name>.tex LaTeX document using the LaTeX2e
    class specified in lclass (default is geophysics.cls from the SEGTeX package)
    with additional options specified in options, additional packages specified
    in use, and additional preamble specified in include.

End()
    A rule to collect default targets (referring to the paper.tex document).

Table 3: Basic methods of an rsf.tex object.

scons
    Generate the default target (usually the PDF file paper.pdf from the source
    LaTeX file paper.tex).

scons pdf or scons <paper name>.pdf
    Generate PDF files from LaTeX sources paper.tex or <paper name>.tex.

scons read or scons <paper name>.read
    Generate PDF files from LaTeX sources paper.tex or <paper name>.tex
    and display them on the screen.

scons print or scons <paper name>.print
    Generate PDF files from LaTeX sources paper.tex or <paper name>.tex
    and print them.

scons html or scons <paper name>.html
    Generate HTML files from LaTeX sources paper.tex or <paper name>.tex
    using LaTeX2HTML. The directory html gets created.

scons install or scons <paper name>.install
    Generate PDF and HTML files from LaTeX sources paper.tex or
    <paper name>.tex and install them in a separate location (used for publishing
    on a web site).

scons wiki or scons <paper name>.wiki
    Convert LaTeX sources paper.tex or <paper name>.tex to the MediaWiki
    format (used for publishing on a Wiki web site).

Table 4: SCons commands defined in rsf.tex.


Example 

This paper by itself is an example of a reproducible document. It is generated using 

the following SConstruct file, which is placed in the directory above the project

directories. 

from rsf.tex import *
Paper('velan',use='hyperref,listings,color')
End(use='hyperref,listings,color',
    color='modl modl2 cdp1500 cdp2000 cdp2500 cdp3000 cdp3500 pick')

This SConstruct generates this paper but it can also compile velan.tex in the 

same directory. Note that there is no Paper command for paper.tex since it is the 

default documentation name. Optional LaTeX packages and style used in paper.tex

are passed in the End command. 

Let’s now have a closer look at paper.tex to understand how the figures of 

the documentation are linked to the reproducible scripts that created them. First 

of all, note that paper.tex is not a regular LaTeX document but only its body

(no \documentclass, \usepackage, etc.). In our paper, Fig. 2 was created in the

project folder easystart (sub-folder of our documentation folder) by the result plot

lena.vpl. In the LaTeX source code, it translates as

\inputdir{easystart}
\sideplot{lena}{height=.25\textheight}{The output of the first
numerical experiment.}

The \inputdir command points to the project directory and the \sideplot command

calls the result plot (here lena). The LaTeX tag of the figure is fig: followed by the result name (here fig:lena).

The first time the paper is compiled, the result file is automatically converted to the 

PDF file format. 

REFERENCES 

Bivand, R., 2006, Implementing spatial data analysis software tools in R: Geographical

Analysis, 38, 23–40. 

Buckheit, J., and D. L. Donoho, 1995, Wavelab and reproducible research, in Wavelets 

and Statistics: Springer-Verlag, 103, 55–81. 

Claerbout, J., 1992a, Electronic documents give reproducible research a new meaning: 

62nd Ann. Internat. Mtg, Soc. of Expl. Geophys., 601–604. 

——–, 1992b, How to use Cake with interactive documents, in SEP-73: Stanford 

Exploration Project, 451–460. 

Claerbout, J. F., and M. Karrenbach, 1993, How to use cake with interactive documents, 

in SEP-77: Stanford Exploration Project, 427–444.


Claerbout, J. F., and D. Nichols, 1990, Why active documents need cake, in SEP-67: 

Stanford Exploration Project, 145–148. 

Dubois, P. F., 2003, Why Johnny can’t build: Computing in Science & Engineering, 

5, 83–88. 

Fomel, S., M. Schwab, and J. Schroeder, 1997, Empowering SEP’s documents, in 

SEP-94: Stanford Exploration Project, 339–361. 

Gentleman, R. C., V. J. Carey, D. M. Bates, B. Bolstad, M. Dettling, S. Dudoit, B. 

Ellis, L. Gautier, Y. Ge, J. Gentry, K. Hornik, T. Hothorn, W. Huber, S. Iacus, R. 

Irizarry, F. Leisch, C. Li, M. Maechler, A. J. Rossini, G. Sawitzki, C. Smith, G. 

Smyth, L. Tierney, J. Y. Yang, and J. Zhang, 2004, Bioconductor: open software 

development for computational biology and bioinformatics: Genome Biology, 5, 

R80. 

Hubbard, B. B., 1998, The world according to wavelets: The story of a mathematical 

technique in the making: AK Peters. 

Knuth, D. E., 1984, Literate programming: Computer Journal, 27, 97–111. 

LeVeque, R. J., to appear, 2006, Wave propagation software, computational science, 

and reproducible research: Presented at the Proc. International Congress of Mathematicians. 

Mallat, S., 1999, A wavelet tour of signal processing: Academic Press. 

Nichols, D., and S. Cole, 1989, Device independent software installation with CAKE, 


Raymond, E. S., 2004, The art of UNIX programming: Addison-Wesley. 

Rossum, G. V., 2000a, Python reference manual: Iuniverse Inc. 

——–, 2000b, Python tutorial: Iuniverse Inc. 

Scales, J. A., and H. Ecke, 2002, What programming languages should we teach our 

undergraduates?: The Leading Edge, 21, 260–267. 

Schwab, M., M. Karrenbach, and J. Claerbout, 2000, Making scientific computations 

reproducible: Computing in Science & Engineering, 2, 61–67. 

Schwab, M., and J. Schroeder, 1995, Reproducible research documents using GNUmake, 


Sigmon, K., and T. A. Davis, 2001, MATLAB primer, sixth edition: Chapman & 

Hall. 

Stallman, R. M., R. McGrath, and P. D. Smith, 2004, GNU make: A program for 

directing recompilation: GNU Press. 

Thimbleby, H., 2003, Explaining code for publication: Software - Practice & Experience, 

33, 975–908. 

van der Linden, P., 1994, Expert C programming: Prentice Hall.
