The few things to know about BIN/BINX files and their handling in R

by Sebastian Kreutzer (June 6, 2021)

Luminescence >= 0.9.0

Creative Commons
Attribution-NonCommercial-NoDerivatives 4.0
International License

1 What is meant with BIN/BINX files?
2 The structure of BIN/BINX files
- 2.1 @METADATA
- 2.2 @DATA
3 File processing and curve selection
4 What about RLum-objects?
5 Some final remarks
References

I’ve realised over some time now that many users seem to struggle to process BIN/BINX files efficiently with the package Luminescence. The function to import BIN/BINX files was around even before the package was released and before Luminescence became a community project. If there is any ambiguity, I feel somehow obliged to lift the fog.

Having said that, with this tutorial, I will try to shed a little bit of light on BIN/BINX file handling using Luminescence, hopefully making it a more joyful experience.

1 What is meant with BIN/BINX files?

When I talk about BIN/BINX files, I refer to files with the ending *.bin or *.binx mainly produced by the commercially available Risø luminescence readers. These files contain the measurement data (typically everything that the photomultiplier detects) produced by the TL/OSL readers of that company in a binary file format. Over the many years these machines have been around, the file format design changed slightly, leading to at least six different versions. Versions 3 to 8 are all supported by Luminescence. Perhaps while I am writing this tutorial, there is already a new version around that I am just not (yet) aware of. If so, please notify me.

The important thing to know about these different versions is that they are not really compatible. The part of the file that includes the metadata (we will talk about it later) differs in length and partly in byte order. I realised this first when I, proud of my first R functions, could no longer import new files after we had updated the system software of our reader. The good news is that the Risø guys have always been very supportive and shared the format documentation, which allowed me to provide timely and good format support in Luminescence. A few more details about the format can be found by typing ?`Risoe.BINfileData-class` in the R terminal.

2 The structure of BIN/BINX files

The easiest way to import BIN/BINX files is to call the function read_BIN2R(). The function will automatically determine the format version. The file name extension does not matter, and both endings *.bin and *.binx (rule of thumb: everything >= V4 has the ending *.binx) are supported. So, the most straightforward code snippet reads:

library(Luminescence)
file <- "20101027_BT707_MAIN_CGQ.BIN"
bin_data <- read_BIN2R(file, txtProgressBar = FALSE)

## 
## [read_BIN2R()]
##   >> 20101027_BT707_MAIN_CGQ.BIN
##   >> 792 records have been read successfully!

Here, file is a character string with the path to your BIN/BINX file. For this tutorial, I will use a dataset I measured during my PhD. The sample was a coarse-grain quartz sample from the loess section Seilitz in Saxony, Germany (Meszner et al. 2013). The parameter txtProgressBar = FALSE suppresses the import progress bar shown in the terminal, something that is not of relevance here.

The output of the function is an R object of class Risoe.BINfileData. I cannot recall why I decided to make the name so long. I guess it was a bad habit. When the object is called, it prints a summary of the object instead of flooding the terminal with data.

bin_data

## 
## [Risoe.BINfileData object]
## 
##  BIN/BINX version      3
##  Object date:          271020, 281020, 291020
##  User:                 Default
##  System ID:            150
##  Overall records:      792
##  Records type:         IRSL  (n = 36)
##                        OSL   (n = 504)
##                        TL    (n = 252)
##  Position range:       1 : 36
##  Grain range:          0 : 0
##  Run range:            1 : 8
##  Set range:            3 : 6

We learn that the file (here version 3) was produced sometime at the end of October 20XX (the format dates back to a time when it was obviously hard to imagine that we would make it past the turn of the millennium) by a user sensibly called Default on a system with serial number 150. Further information shows the overall number of records (luminescence curves of different types) and the range of assigned positions. Run and set range both refer to the measurement sequence design.

The object itself follows the so-called S4 definition. This is of no further relevance here, except that the magic operator to access elements (slots) of the object is the @ symbol (?`@`). Alternatively, you can inspect the object with str(bin_data). Personally, I never found this function really helpful, in particular not for large objects.
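For readers who have not met S4 objects before, a minimal sketch may help. It defines a small mock class with the same two slot names and accesses them with @. The class name MockBINfileData and all values are purely illustrative assumptions; this is not the class definition used by Luminescence.

```r
library(methods)

## define a mock S4 class with two slots (illustration only;
## this is NOT the class definition used by 'Luminescence')
setClass(
  "MockBINfileData",
  representation(METADATA = "data.frame", DATA = "list")
)

## create an object with two made-up records
mock <- new(
  "MockBINfileData",
  METADATA = data.frame(ID = 1:2, LTYPE = c("OSL", "TL")),
  DATA = list(c(10, 8, 5), c(1, 4, 9))
)

## list the available slots and access them with '@'
slotNames(mock)
mock@METADATA$LTYPE
mock@DATA[[2]]

## slot() is the functional equivalent of '@'
identical(slot(mock, "DATA"), mock@DATA)
```

The same access pattern applies to the imported object, for instance slotNames(bin_data).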

2.1 `@METADATA`

Once imported, the object (here bin_data) contains elements called slots. One is METADATA, which is a data.frame, and it includes all metadata of the measurements as some kind of big spreadsheet you may have already seen in the central window of the software Analyst (Duller 2015).

head(bin_data@METADATA)

ID	SEL	VERSION	LENGTH	PREVIOUS	NPOINTS	RUN	SET	POSITION	CURVENO	SAMPLE	COMMENT	SYSTEMID	FNAME	USER	TIME	DATE	DTYPE	TAG	LTYPE	LIGHTSOURCE	LPOWER	LIGHTPOWER	HIGH	RATE	MEASTEMP	AN_TEMP	AN_TIME	IRR_DOSERATE	IRR_DOSERATEERR	TIMESINCEIRR	TIMETICK	STIMPERIOD	DEADTIME	MAXLPOWER	XRF_ACQTIME	XRF_HV	XRF_CURR	XRF_DEADTIMEF	DETECTOR_ID	LOWERFILTER_ID	UPPERFILTER_ID	ENOISEFACTOR	MARKPOS_X1	MARKPOS_Y1	MARKPOS_X2	MARKPOS_Y2	MARKPOS_X3	MARKPOS_Y3	SEQUENCE
1	TRUE	3	8272	0	2000	1	3	1	NA	BT 707 CGQ	Natural	150	20101027_BT707_MAIN_CGQ	Default	13:39:24	271020	Natural	1	OSL	Blue Diodes	90	90	40	5	NA	125	10	NA	NA	NA	NA	NA	NA	NA	NA	NA	NA	NA	NA	NA	NA	NA	NA	NA	NA	NA	NA	NA	20101027
2	TRUE	3	8272	8272	2000	1	3	2	NA	BT 707 CGQ	Natural	150	20101027_BT707_MAIN_CGQ	Default	13:41:09	271020	Natural	1	OSL	Blue Diodes	90	90	40	5	NA	125	10	NA	NA	NA	NA	NA	NA	NA	NA	NA	NA	NA	NA	NA	NA	NA	NA	NA	NA	NA	NA	NA	20101027
3	TRUE	3	8272	8272	2000	1	3	3	NA	BT 707 CGQ	Natural	150	20101027_BT707_MAIN_CGQ	Default	13:42:56	271020	Natural	1	OSL	Blue Diodes	90	90	40	5	NA	125	10	NA	NA	NA	NA	NA	NA	NA	NA	NA	NA	NA	NA	NA	NA	NA	NA	NA	NA	NA	NA	NA	20101027
4	TRUE	3	8272	8272	2000	1	3	4	NA	BT 707 CGQ	Natural	150	20101027_BT707_MAIN_CGQ	Default	13:44:43	271020	Natural	1	OSL	Blue Diodes	90	90	40	5	NA	125	10	NA	NA	NA	NA	NA	NA	NA	NA	NA	NA	NA	NA	NA	NA	NA	NA	NA	NA	NA	NA	NA	20101027
5	TRUE	3	8272	8272	2000	1	3	5	NA	BT 707 CGQ	Natural	150	20101027_BT707_MAIN_CGQ	Default	13:46:29	271020	Natural	1	OSL	Blue Diodes	90	90	40	5	NA	125	10	NA	NA	NA	NA	NA	NA	NA	NA	NA	NA	NA	NA	NA	NA	NA	NA	NA	NA	NA	NA	NA	20101027
6	TRUE	3	8272	8272	2000	1	3	6	NA	BT 707 CGQ	Natural	150	20101027_BT707_MAIN_CGQ	Default	13:48:15	271020	Natural	1	OSL	Blue Diodes	90	90	40	5	NA	125	10	NA	NA	NA	NA	NA	NA	NA	NA	NA	NA	NA	NA	NA	NA	NA	NA	NA	NA	NA	NA	NA	20101027

2.2 `@DATA`

The second slot, DATA, is of type list. The element contains the actual measurement data in the order they were recorded. In R they are represented as numeric vectors.

## show first 20 data points 
## of the first record
bin_data@DATA[[1]][1:20]

##  [1] 1057  952  876  798  835  730  726  646  650  553  526  562  507  443  398
## [16]  388  367  369  336  304

Every single row in METADATA refers to one record in DATA. The link between the two is the column ID in METADATA.

It is essential to understand that DATA only contains the count data of the measurement, which means y values only. It may look odd, but the data are stored this way in the BIN/BINX files for reasons of memory efficiency. The x values (time or temperature values) are calculated on the fly, if needed, using the METADATA information.
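To illustrate the idea, here is a minimal sketch of how equally spaced x values could be derived from two metadata fields of the first record shown above (NPOINTS = 2000 and HIGH = 40). It assumes that HIGH marks the end of the x-axis (here the stimulation time in seconds) and that all channels have the same width; the calculation in the package may differ in detail. The count values are made up.

```r
## metadata values of record 1 as printed above
NPOINTS <- 2000  # number of channels
HIGH <- 40       # assumed upper limit of the x-axis (s)

## assumption: equally wide channels, x marks the end of each channel
x <- seq(from = HIGH / NPOINTS, to = HIGH, length.out = NPOINTS)

## made-up count values standing in for bin_data@DATA[[1]]
set.seed(1)
y <- rpois(NPOINTS, lambda = 1000 * exp(-x / 5) + 20)

## combine both to a two-column matrix (x, y), ready for plotting
curve <- cbind(x = x, y = y)
head(curve, 3)
```

With real data, y would simply be replaced by one element of the DATA slot and the two metadata values by the matching row in METADATA.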

3 File processing and curve selection

So, it appears that importing BIN/BINX files into R isn’t difficult after all because it is handled by the function read_BIN2R(), regardless of whether you know how data are stored in the BIN/BINX file.

Unfortunately, the import is usually only the first step. If you want to quickly select some of the relevant curves, plot them, or do other things, here is a list of some useful functions:

FUNCTION	PURPOSE
`read_BIN2R()`	Import BIN/BINX files into R
`write_R2BIN()`	Write content previously imported again back into a BIN/BINX file
`convert_BIN2CSV()`	Convert BIN/BINX files to CSV files to be processed with other software
`merge_Risoe.BINfileData()`	Merges BIN/BINX files or such objects previously imported with `read_BIN2R()`
`subset()`	Subsetting (extracting) parts of the data from the BIN/BINX file
`plot_Risoe.BINfileData()`	Plots the records in the file
`Risoe.BINfileData2RLum.Analysis()`	Converts the `Risoe.BINfileData` to `RLum.Data.Curve` and `RLum.Analysis` objects

3.1 Import and export

3.1.1 Import: `read_BIN2R()`

Importing (read) and exporting (write) data is an obvious task, but what else has the function to offer?

args(read_BIN2R)

## function (file, show.raw.values = FALSE, position = NULL, n.records = NULL, 
##     zero_data.rm = TRUE, duplicated.rm = FALSE, fastForward = FALSE, 
##     show.record.number = FALSE, txtProgressBar = TRUE, forced.VersionNumber = NULL, 
##     ignore.RECTYPE = FALSE, pattern = NULL, verbose = TRUE, ...) 
## NULL

First, there are a few technical parameters, such as txtProgressBar, verbose and show.record.number. Unless you are planning on writing a tutorial, you usually do not need these parameters because they only change what is shown in the R terminal during import. Their settings do not alter the data import.

show.raw.values, forced.VersionNumber, and ignore.RECTYPE offer some kind of debugging functionality and error handling without accessing the underlying code. I am not sure whether anybody ever used these features.

zero_data.rm and duplicated.rm are very useful if something went wrong during the measurement because they remove all broken or duplicated records from the import (duplication appears to happen a lot during single grain measurements).
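What such a clean-up amounts to can be sketched in base R on a made-up list of records. How the package identifies broken or duplicated records internally is not shown here; flagging empty vectors and exact copies of the count values is an assumption made for illustration only.

```r
## mock list of records (made-up values)
records <- list(
  c(120, 98, 75),
  numeric(0),      # broken record without any counts
  c(33, 31, 30),
  c(120, 98, 75)   # exact copy of the first record
)

## flag records without count values
is_empty <- lengths(records) == 0

## flag records that are exact copies of an earlier record
is_duplicate <- duplicated(records)

## keep only the clean records
clean <- records[!(is_empty | is_duplicate)]
length(clean)
```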

pattern, position and n.records are more interesting. Let’s start with pattern. Like many functions in 'Luminescence', read_BIN2R() is designed to iterate automatically over large datasets. If you provide only a path (for example, to a folder with many BIN/BINX files) as the first argument, file, then pattern takes a character string or a regular expression (?regex) to select only files with file names matching the pattern. For example,

read_BIN2R(file = "/myBIN_file_folder/", pattern = "Aberystwyth")

would only import BIN/BINX files whose file names contain the word “Aberystwyth”.
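How such a pattern narrows down the content of a folder can be tried with base R alone. The sketch below uses list.files() on a temporary folder filled with empty files. The file names are hypothetical, and that read_BIN2R() selects files in a comparable way is an assumption, not something taken from the package code.

```r
## create a temporary folder with a few made-up (empty) files
folder <- file.path(tempdir(), "myBIN_file_folder")
dir.create(folder, showWarnings = FALSE)
files <- c("Aberystwyth_01.binx", "Aberystwyth_02.BIN", "Bayreuth_01.binx")
invisible(file.create(file.path(folder, files)))

## select file names matching a simple pattern
list.files(folder, pattern = "Aberystwyth")

## the pattern is a regular expression, so it can be more specific,
## e.g., only files ending with '.binx' (ignoring upper/lower case)
list.files(folder, pattern = "\\.binx$", ignore.case = TRUE)
```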

The arguments position and n.records allow you to limit the import to a particular position range or a number of records.

## import only records from position 1
read_BIN2R(file, position = 1)

## import only the first 100 records (regardless of the position number)
read_BIN2R(file, n.records = 100)

Side note: Unlike the selection of records, the selection of only one position will not speed up the import of the file because until all records are imported, the function does not know whether a position comes up again or not.

3.1.2 Export: `write_R2BIN()`

The function write_R2BIN() works very similarly but with fewer arguments. The most important one is the argument version. It allows you, for instance, to import a file of version 3 and export it again in version 8 to be compatible with other software.

## import BIN-file version 3
V3 <- read_BIN2R(file, verbose = FALSE)

## export to version 8, here a temporary file
write_R2BIN(V3, tempfile(), version = "8", txtProgressBar = FALSE)

3.1.3 Export as CSV: `convert_BIN2CSV()`

Sometimes R simply isn’t the tool you want to use or can use. Or a colleague asks you, “Can you please mail me the records as CSV?” The reasons are manifold. Luckily, R isn’t a closed environment, and the easiest way to exchange curve data is to do so as CSV files because basically every software can work with these files.

The only tricky part with BIN/BINX files is that we are missing the x-axis data. However, the function convert_BIN2CSV() does the calculation.

output_path <- tempdir()
convert_BIN2CSV(file, path = output_path, verbose = FALSE)
head(list.files(output_path))

## [1] "[[1]]_1_OSL.csv"  "[[1]]_10_OSL.csv" "[[1]]_11_TL.csv"  "[[1]]_12_OSL.csv"
## [5] "[[1]]_13_OSL.csv" "[[1]]_14_TL.csv"

3.2 Merging files

The idea of merging BIN/BINX files is probably self-explanatory. You may have split your measurements into different files on purpose, or a measurement stopped in the middle, you had to re-run the sequence, and you ended up with multiple files you now want to combine. The function merge_Risoe.BINfileData() takes either file names (or paths to files) or objects already imported via read_BIN2R(). We can try this with the BIN/BINX file we imported a few lines above (the object called bin_data).

merge_Risoe.BINfileData(c(bin_data, bin_data))

## 
## [Risoe.BINfileData object]
## 
##  BIN/BINX version      3
##  Object date:          271020, 281020, 291020
##  User:                 Default
##  System ID:            150
##  Overall records:      1584
##  Records type:         IRSL  (n = 72)
##                        OSL   (n = 1008)
##                        TL    (n = 504)
##  Position range:       1 : 72
##  Grain range:          0 : 0
##  Run range:            1 : 8
##  Set range:            3 : 6

The output is once again a Risoe.BINfileData object, with a crucial difference: now, the position number runs from 1 to 72(!). Obviously, if there is a Risø device with a carousel with so many aliquot positions, it is not commercially available. The reason for this recalculation of positions is that data analysis is usually carried out based on position numbers. If we appended the new data without taking care of the position numbers, position numbers would appear twice (or multiple times).

Duplicated position numbers might be wanted, for instance, if the reason for merging BIN/BINX files was a broken measurement. More likely, however, is that you have measured one sample over, let’s say, two carousels (\(2\times48\) positions), simply because you wanted to increase the number of aliquots. In that case, you want to treat each of the positions as unique because they represent individual aliquots, and the merge function takes care of it.

You can control this behaviour by setting the parameter keep.position.number to either FALSE (the default) or TRUE. Additionally, you may have used only every 2nd or 3rd position on the sample carousel of the reader. It would not matter for any subsequent analysis, but you may want to preserve that information. For this purpose, you can use the argument position.number.append.gap.
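The renumbering can be pictured with a few lines of base R. The helper append_positions() below is hypothetical and only mimics the idea described above; it is not the code used by merge_Risoe.BINfileData(). In particular, treating the gap as a simple offset added to the highest position of the first measurement is an assumption.

```r
## hypothetical helper: shift the positions of a second measurement
## so that they follow the positions of the first one
append_positions <- function(pos_1, pos_2, keep = FALSE, gap = 0) {
  ## keep the original numbers, i.e. positions may appear twice
  if (keep) return(c(pos_1, pos_2))

  ## otherwise continue counting after the highest position (plus gap)
  c(pos_1, pos_2 + max(pos_1) + gap)
}

## two measurements with 36 positions each
range(append_positions(1:36, 1:36))

## same, but keeping the original position numbers
range(append_positions(1:36, 1:36, keep = TRUE))

## leave a gap of two positions between both measurements
range(append_positions(1:36, 1:36, gap = 2))
```

The first call reproduces the position range 1 : 72 seen in the merged object above.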

3.3 Subsetting of records

Subsetting or selecting particular records from a BIN/BINX file is probably the most complicated part. I have seen a couple of times that users first did this with the Analyst before importing the file into R. Well, there is no need for it. We have already described one way of selecting data in Fuchs et al. (2015).

ID <- bin_data@METADATA[bin_data@METADATA$RUN == 1,"ID"]

This call would give you the record identifiers of all records with the attribute run = 1. From there, we could narrow the data down to only what we need (see Fuchs et al. 2015 for more details). When the paper was written (not when it was published), this was the way to select records. It was a little bit cumbersome and not really clean, and the way to go later was to use the RLum objects instead (briefly covered below).

Because I personally stopped working with BIN/BINX files regularly and use RLum objects instead, it took me a while before I realised that people were still trying to select records that way. I can see the appeal: the big table is what makes selecting records according to their metadata in the Analyst so easy. So why not make it as easy as in the Analyst? The function here is called subset(). This function exists even without the 'Luminescence' package to subset data.frames in R. For the 'Luminescence' package, I added a new method to this function to work with Risoe.BINfileData objects. In other words, the function subset() works as you would expect if you are familiar with base R and the handling of data.frames. Only here it takes care of the peculiarities of the Risoe.BINfileData objects.

subset(bin_data, RUN == 1)

## 
## [Risoe.BINfileData object]
## 
##  BIN/BINX version      3
##  Object date:          271020
##  User:                 Default
##  System ID:            150
##  Overall records:      108
##  Records type:         OSL   (n = 72)
##                        TL    (n = 36)
##  Position range:       1 : 36
##  Grain range:          0 : 0
##  Run range:            1 : 1
##  Set range:            3 : 6

This is essentially the same selection we did a few lines above. The only difference is that the output is again a Risoe.BINfileData object. And the selection can be more sophisticated. For example, we could select only the records of positions 10 to 30 with a run number > 2.

subset(bin_data, POSITION >= 10 & POSITION <= 30 & RUN > 2)

## 
## [Risoe.BINfileData object]
## 
##  BIN/BINX version      3
##  Object date:          281020, 291020
##  User:                 Default
##  System ID:            150
##  Overall records:      336
##  Records type:         IRSL  (n = 21)
##                        OSL   (n = 210)
##                        TL    (n = 105)
##  Position range:       10 : 30
##  Grain range:          0 : 0
##  Run range:            3 : 8
##  Set range:            3 : 6

This is a quick way of selecting the right curves needed for the analysis. Supported fields are all(!) column names of the METADATA slot (colnames(bin_data@METADATA)).
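If you have not used subset() in base R before, the sketch below shows the underlying idea on a small made-up data.frame that mimics a few METADATA columns. It also shows how the matching ID values could be used to pick the corresponding entries from a list of records, which is the kind of bookkeeping the method for Risoe.BINfileData objects takes off your hands. All values are invented for illustration.

```r
## mock metadata table: 3 positions x 4 runs
meta <- expand.grid(POSITION = 1:3, RUN = 1:4)
meta$ID <- seq_len(nrow(meta))
meta$LTYPE <- ifelse(meta$RUN %% 2 == 0, "TL", "OSL")

## mock records: one numeric vector per row of the metadata
records <- lapply(meta$ID, function(i) rep(i, 5))

## select rows by a logical expression on the column names
sel <- subset(meta, POSITION >= 2 & RUN > 2 & LTYPE == "OSL")
sel

## use the ID column to extract the matching records
sel_records <- records[sel$ID]
length(sel_records)
```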

3.4 Changing the metadata of a record

Sometimes it is necessary to correct the data before we can process them further. A typical example would be LTYPE, the type of luminescence. For instance, our imported dataset has three different curve types.

unique(bin_data@METADATA$LTYPE)

## [1] "OSL"  "TL"   "IRSL"

Let’s assume we want to treat the IRSL curves as OSL curves, and therefore we have to rename them first. We can replace all relevant entries using base R functionality.

new_bin_data <- bin_data
new_bin_data@METADATA[new_bin_data@METADATA$LTYPE == "IRSL", "LTYPE"] <- "OSL"

3.5 Changing records

Changing count values in a record works likewise. For example, to replace all values in record number 5 with random noise, we could write:

## set plot panel showing 1 row and 2 columns
par(mfrow = c(1,2))

## plot the record as it appears before the replacement
plot(new_bin_data@DATA[[5]], main = "before")

## replace values in record
new_bin_data@DATA[[5]] <- runif(length(new_bin_data@DATA[[5]]))

## plot record after the replacement
plot(new_bin_data@DATA[[5]], main = "after")

3.6 Plotting

It is always a good idea to look at your data before processing them. Of course, we may use standard R functions, such as plot(), as shown above. Still, to have all the information about the curve type and the correct axis labelling at hand, it is easier to use a function that does all of this automatically.

## set plot panel (three rows, eight columns) 
par(mfrow = c(3,8))

## plot dataset
plot_Risoe.BINfileData(bin_data, position = 1)

This plotting function does not do much but comes with some handy arguments such as sorter, set to POSITION by default. It can be set to any other argument to see the curves in a different order. Equally interesting might be the option curve.transformation, which converts CW-OSL and CW-IRSL curves into pseudo-LM curves as suggested by Bos and Wallinga (2012).

par(mfrow = c(1,3))
plot_Risoe.BINfileData(
  bin_data,
  position = 1,
  run = 2,
  curve.transformation = "CW2pLMi"
)

The transformation happens on the fly, and of course, the function only transforms curves where such a transformation makes sense. In our case, the TL curve remains untouched.

4 What about RLum-objects?

Last but not least, there is an important aspect I have not mentioned yet: the analysis and calculation functions do not work with Risoe.BINfileData objects, but with something called RLum objects. Detailing the background and purpose of the RLum objects is a tutorial of its own. Here, let’s just say that it is easier to work with a unified RLum object structure because BIN/BINX files are not the only files that can be processed with 'Luminescence', and every file format is different. Perhaps the last thing you want to do as a user is to overthink file format differences.

Suppose, we have now selected (subset) all curves of interest and want to further work with the data. This means the final import step requires that the Risoe.BINfileData is converted into so-called RLum objects.

data_rlum <- Risoe.BINfileData2RLum.Analysis(bin_data)

The function allows you to set a couple of options, for instance, the position number (argument pos). This somewhat duplicates the functionality of subset(). However, subset() came later, so this function’s arguments are mainly leftovers kept for backward compatibility and are not as powerful.

If you don’t want to work with the Risoe.BINfileData objects at all, you can import your file using the argument fastForward, which does the rest for you.

data_rlum <- read_BIN2R(file, fastForward = TRUE, verbose = FALSE)
data_rlum[[1]]

## 
##  [RLum.Analysis-class]
##   originator: Risoe.BINfileData2RLum.Analysis()
##   protocol: unknown
##   additional info elements:  0
##   number of records: 22
##   .. : RLum.Data.Curve : 22
##   .. .. : #1 OSL | #2 TL | #3 OSL | #4 OSL | #5 TL | #6 OSL | #7 OSL
##   .. .. : #8 TL | #9 OSL | #10 OSL | #11 TL | #12 OSL | #13 OSL | #14 TL
##   .. .. : #15 OSL | #16 OSL | #17 TL | #18 OSL | #19 OSL | #20 TL | #21 OSL
##   .. .. : #22 IRSL

Either way, the result is the same, and you end up in a completely different world, the world of RLum objects, ready to be used with a lot of functions in the R package 'Luminescence'. Why we need RLum objects and how we can process them efficiently is stuff for another tutorial.

5 Some final remarks

I have heard a few times (indeed, only a few times) that it would be nice if read_BIN2R() and write_R2BIN() were to work faster. Well, indeed, and here are my two thoughts on it: (1) if you buy a faster computer, the import will also be faster; (2) writing the two functions in C/C++ would bring a tremendous speed boost. The only problem with the latter is that it might not work equally well on all platforms without much effort. You can run R and 'Luminescence' on Windows, Linux or macOS and still import your BIN/BINX files in the same way. I believe that this advantage outweighs the slower import speed.

Last, if you feel something is missing in this tutorial, please write me an email.

References

Bos, Adrie J J, and Jakob Wallinga. 2012. “How to Visualize Quartz OSL Signal Components.” Radiation Measurements 47 (9): 752–58. https://doi.org/10.1016/j.radmeas.2012.01.013.

Duller, G A T. 2015. “The Analyst Software Package for Luminescence Data: Overview and Recent Improvements.” Edited by Regina DeWitt. Ancient TL 33 (1): 35–42.

Fuchs, Margret C, Sebastian Kreutzer, Christoph Burow, Michael Dietze, Manfred Fischer, Christoph Schmidt, and Markus Fuchs. 2015. “Data Processing in Luminescence Dating Analysis: An Exemplary Workflow Using the R Package ‘Luminescence’.” Quaternary International 362: 8–13. https://doi.org/10.1016/j.quaint.2014.06.034.

Meszner, Sascha, Sebastian Kreutzer, Markus Fuchs, and Dominik Faust. 2013. “Late Pleistocene Landscape Dynamics in Saxony, Germany: Paleoenvironmental Reconstruction Using Loess-Paleosol Sequences.” Quaternary International 296 (May): 95–107. https://doi.org/10.1016/j.quaint.2012.12.040.