Context Scenes
This page describes the latest version of the ContextScene file format (version 6).
A ContextScene is a metadata file designed to manipulate raw reality data such as photos, maps, meshes and point clouds. It also stores extra metadata about these reality data, like photo positions, detected objects, etc. Though not mandatory, the following assumes that you are familiar with the Reality Management Service and its Reality Management API. If that is not the case, think of the Reality Management Service as a set of directories, referred to as entries, each with a universally unique identifier (UUID). Depending on what they store, these entries have a type, ContextScene being one of them. Other types involved here are images (CCImageCollection), meshes (3SM, 3MX) and point clouds (LAS, LAZ, OPC, PointCloud). Check the existing types on the Reality Data Properties page.
A ContextScene is persisted as a JSON file. Name the file ContextScene.json when storing it in the Reality Management Service.
1) Photos
To make it possible to refer to only some of the photos in a directory rather than all of them, a ContextScene contains one entry per photo. Each photo has a unique id, used to refer to that photo in other parts of the ContextScene file. See the examples below.
{
"version": "6.0",
"PhotoCollection": {
"Photos": {
"0": {
"ImagePath": "0:IMAGE_1059.JPG"
},
"1": {
"ImagePath": "0:IMAGE_1060.JPG"
},
"2": {
"ImagePath": "0:IMAGE_1061.JPG"
}
}
},
"References": {
"0": {
"Path": "Q:/DataSets/Motos/Images"
}
}
}
Note that the complete path is obtained through a set of references, the prefix n: standing for the reference path of id n. Hence, the first photo above is Q:\DataSets\Motos\Images\IMAGE_1059.JPG. This mechanism allows easy relocation of the raw reality data, in particular when uploading it to a cloud repository. For example, assume the photos in Q:\DataSets\Motos\Images have been uploaded to the Reality Management Service in a CCImageCollection entry with UUID 7c00e184-5913-423b-8b4c-840ceb4bf616; then you only have to modify the References section of the ContextScene as follows to point to the Reality Management Service version of the photos:
...
"References" : {
"0" : {
"Path" : "rds:7c00e184-5913-423b-8b4c-840ceb4bf616"
}
}
...
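For illustration, expanding a prefixed path is straightforward once the References section is at hand. Here is a minimal Python sketch; resolve_path is our helper name, not part of any Bentley SDK:
import json

def resolve_path(prefixed_path: str, references: dict) -> str:
    """Expand the 'n:' prefix of a path using the References section."""
    ref_id, _, relative = prefixed_path.partition(":")
    base = references[ref_id]["Path"]
    # For an 'rds:<uuid>' reference the result keeps the rds prefix;
    # mapping it to actual cloud storage is up to the consumer.
    return f"{base}/{relative}"

with open("ContextScene.json") as f:  # hypothetical local copy
    scene = json.load(f)

references = scene["References"]
for photo_id, photo in scene["PhotoCollection"]["Photos"].items():
    print(photo_id, resolve_path(photo["ImagePath"], references))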
2) Photos with positions
It is sometimes required to indicate where the photos were taken from (for 3D search), or even in which direction and with which exact camera (for 2D-to-3D mapping). Position and orientation are stored using a Pose, while the camera parameters use a Device. Here is an example where the photos are geo-localized. In this case, we also need a Spatial Reference System (SRS). SRS are given through a definition string (EPSG, WKT, ...) and referred to via their id. All the poses are given in a common SRS. Here is a sample:
{
"version" : "6.0",
"SpatialReferenceSystems" : {
"1" : {
"Definition" : "EPSG:32635"
}
},
"PhotoCollection" : {
"SRSId" : 1,
"Poses" : {
"3" : {
"Center" : {
"x" : 388620.561400515,
"y" : 6059714.92091526,
"z" : 252.657253024168
}
},
"4" : {
"Center" : {
"x" : 388624.256090598,
"y" : 6059693.61938892,
"z" : 253.549586427398
}
},
"5" : {
"Center" : {
"x" : 388614.899373967,
"y" : 6059749.28043899,
"z" : 252.456919016317
}
}
},
"Photos" : {
"3" : {
"ImagePath" : "0:image_1.JPG",
"PoseId" : 3
},
"4" : {
"ImagePath" : "0:image_2.JPG",
"PoseId" : 4
},
"5" : {
"ImagePath" : "0:image_3.JPG",
"PoseId" : 5
}
}
},
"References" : {
"0" : {
"Path" : "Q:/Analyze/TrainingScenes/Datasets/Example/city"
}
}
}
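A consumer joins Photos to Poses through PoseId. Here is a minimal Python sketch listing each photo with its geo-location (the ContextScene.json path is hypothetical):
import json

with open("ContextScene.json") as f:  # hypothetical local copy
    scene = json.load(f)

collection = scene["PhotoCollection"]
poses = collection["Poses"]
for photo in collection["Photos"].values():
    center = poses[str(photo["PoseId"])]["Center"]  # pose keys are strings
    print(photo["ImagePath"], center["x"], center["y"], center["z"])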
3) Photos with orientations
In this case, the direction in which each photo was taken is provided, as a rotation. The camera parameters are also given, using a device entry. Jobs relying on 2D+3D reasoning usually involve such advanced ContextScenes (drones, mobile mapping, etc.). Check the Reality Modeling User Guide for the concepts and conventions used by Bentley Systems. In the following example, 3 photos are taken with the same camera. Poses are not geo-referenced; a local system is used (no SRS):
{
"version": "6.0",
"PhotoCollection": {
"Devices": {
"0": {
"Type": "perspective",
"Dimensions": {
"width": 5472,
"height": 3648
},
"PrincipalPoint": {
"x": 2718.83277672126,
"y": 1826.98620377713
},
"FocalLength": 2174.43172433616,
"RadialDistortion": {
"k1": -0.0135233892956603,
"k2": 0.00403860548497617,
"k3": -0.000308785047808229
},
"TangentialDistortion": {
"p1": -0.0014916349534087,
"p2": -0.000189437237012201
},
"AspectRatio": 1,
"Skew": 0
}
},
"Poses": {
"0": {
"Center": {
"x": -1.62595567955807,
"y": -0.080899284124208,
"z": 0.00756484596716649
},
"Rotation": {
"omega": -2.944051582136,
"phi": 0.225134191045904,
"kappa": -0.0446515192166179
}
},
"1": {
"Center": {
"x": -0.184453182366224,
"y": -0.00623637496469653,
"z": -0.020190179737657
},
"Rotation": {
"omega": -2.98523877071878,
"phi": 0.231294571221302,
"kappa": -0.0405997858706959
}
},
"2": {
"Center": {
"x": 1.8104088619243,
"y": 0.0871356590889045,
"z": 0.0126253337704903
},
"Rotation": {
"omega": -2.96497834304822,
"phi": 0.0627398648125152,
"kappa": -0.0538948521249639
}
}
},
"Photos": {
"0": {
"ImagePath": "0:IMG_1059.JPG",
"DeviceId": 0,
"PoseId": 0
},
"1": {
"ImagePath": "0:IMG_1060.JPG",
"DeviceId": 0,
"PoseId": 1
},
"2": {
"ImagePath": "0:IMG_1061.JPG",
"DeviceId": 0,
"PoseId": 2
}
}
},
"References": {
"0": {
"Path": "Q:/Analyze/DataSets/Motos/images"
}
}
}
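To make the pose and device entries concrete, here is a Python sketch that projects a world point into a photo. It is only an illustration: it assumes the omega/phi/kappa angles are in radians and compose as Rx(omega)·Ry(phi)·Rz(kappa), and it uses the common Brown-Conrady form for the radial and tangential terms; the authoritative conventions are those of the Reality Modeling User Guide.
import numpy as np

def rotation_opk(omega: float, phi: float, kappa: float) -> np.ndarray:
    # Assumed composition Rx(omega) @ Ry(phi) @ Rz(kappa), angles in radians.
    co, so = np.cos(omega), np.sin(omega)
    cp, sp = np.cos(phi), np.sin(phi)
    ck, sk = np.cos(kappa), np.sin(kappa)
    rx = np.array([[1, 0, 0], [0, co, -so], [0, so, co]])
    ry = np.array([[cp, 0, sp], [0, 1, 0], [-sp, 0, cp]])
    rz = np.array([[ck, -sk, 0], [sk, ck, 0], [0, 0, 1]])
    return rx @ ry @ rz

def project(point, pose: dict, device: dict):
    """Project a world point into a photo (sketch, not the official model)."""
    c = pose["Center"]
    r = pose["Rotation"]
    rot = rotation_opk(r["omega"], r["phi"], r["kappa"])
    # World point expressed in the camera frame.
    pc = rot @ (np.asarray(point, dtype=float) - np.array([c["x"], c["y"], c["z"]]))
    x, y = pc[0] / pc[2], pc[1] / pc[2]
    # Brown-Conrady style radial and tangential distortion (assumed).
    k1, k2, k3 = (device["RadialDistortion"][k] for k in ("k1", "k2", "k3"))
    p1, p2 = (device["TangentialDistortion"][p] for p in ("p1", "p2"))
    r2 = x * x + y * y
    radial = 1 + k1 * r2 + k2 * r2 ** 2 + k3 * r2 ** 3
    xd = x * radial + 2 * p1 * x * y + p2 * (r2 + 2 * x * x)
    yd = y * radial + p1 * (r2 + 2 * y * y) + 2 * p2 * x * y
    pp = device["PrincipalPoint"]
    f = device["FocalLength"]
    return f * xd + pp["x"], f * yd + pp["y"]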
4) Orthophoto
An orthophoto, or map, is an aerial photograph that has been geometrically corrected ('ortho-rectified') so that its scale is uniform and it can be used in the same manner as a map. It is usually split into tiles, each of them having a 2D location in a given SRS. By introducing a specific type of device, an orthotile, to describe the scale parameters, and by adding 2D location coordinates to photos, we simply extend the ContextScene format to describe an orthophoto. Here is a sample where tiles are 3200x4800 images and the resolution is 7.5cm per pixel. The location is the position of the upper-left corner pixel of the image. Note that the y axis of the image pixels goes down, toward the south. Being in the northern hemisphere, adding 1 to the pixel y coordinate corresponds to removing 7.5cm from the geographic y coordinate, hence a negative pixel height of -0.075. A NoData value can also be provided for pixels where the orthophoto is unknown (shown in the next section's example). Such parameters are common and are usually found in GeoTIFF files or .tfw sidecar files.
{
"version": "6.0",
"SpatialReferenceSystems": {
"0": {
"Definition": "EPSG:2193"
}
},
"PhotoCollection": {
"SRSId": 0,
"Devices": {
"0": {
"Type": "orthotile",
"Dimensions": {
"width": 3200,
"height": 4800
},
"PixelSize": {
"Width": 0.075,
"Height": -0.075
}
}
},
"Photos": {
"0": {
"ImagePath": "0:Production_39_ortho_part_1_1.tif",
"DeviceId": 0,
"Location": {
"UlX": 533418,
"UlY": 5212669.912
}
},
"1": {
"ImagePath": "0:Production_39_ortho_part_1_2.tif",
"DeviceId": 0,
"Location": {
"UlX": 533712.912,
"UlY": 5212669.912
}
},
"2": {
"ImagePath": "0:Production_39_ortho_part_2_1.tif",
"DeviceId": 0,
"Location": {
"UlX": 533418,
"UlY": 5212824.064
}
}
}
},
"References": {
"0": {
"Path": "Q:/Datasets/Graz/ortho"
}
}
}
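Given an orthotile device and a photo's Location, pixel and geographic coordinates convert with a simple affine relation. A minimal Python sketch (the function name is ours; it treats (0, 0) as the upper-left pixel, as described above):
def pixel_to_geo(location: dict, pixel_size: dict, col: float, row: float):
    """Geographic coordinates of pixel (col, row), (0, 0) being the
    upper-left pixel; Height is negative because the image y axis
    points south."""
    x = location["UlX"] + col * pixel_size["Width"]
    y = location["UlY"] + row * pixel_size["Height"]
    return x, y

# Upper-left pixel of the first tile of the sample above:
print(pixel_to_geo({"UlX": 533418, "UlY": 5212669.912},
                   {"Width": 0.075, "Height": -0.075}, 0, 0))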
5) Orthophoto with height
Some orthophotos, often called ortho DSM, provide not only the color of the map at a given position, but also its height. This DSM is given as another tile image, with floating-point pixel values. We simply add a DepthPath entry to the photos. Though not documented above, this depth also makes sense for usual photos when the sensor provides a depth (e.g., an iPhone 13 Pro), hence the tag's name.
{
"version": "6.0",
"SpatialReferenceSystems": {
"0": {
"Definition": "EPSG:2193"
}
},
"PhotoCollection": {
"SRSId": 0,
"Devices": {
"0": {
"Type": "orthotile",
"Dimensions": {
"width": 4096,
"height": 4096
},
"PixelSize": {
"Width": 0.072,
"Height": -0.072
},
"NoData": -9999
}
},
"Photos": {
"0": {
"ImagePath": "0:Production_39_ortho_part_1_1.tif",
"DeviceId": 0,
"Location": {
"UlX": 533418,
"UlY": 5212669.912
},
"DepthPath": "1:Production_39_DSM_part_1_1.tif"
},
"1": {
"ImagePath": "0:Production_39_ortho_part_1_2.tif",
"DeviceId": 0,
"Location": {
"UlX": 533712.912,
"UlY": 5212669.912
},
"DepthPath": "1:Production_39_DSM_part_1_2.tif"
},
"2": {
"ImagePath": "0:Production_39_ortho_part_2_1.tif",
"DeviceId": 0,
"Location": {
"UlX": 533418,
"UlY": 5212824.064
},
"DepthPath": "1:Production_39_DSM_part_2_1.tif"
}
}
},
"References": {
"0": {
"Path": "Q:/Analyze/TrainingScenes/Datasets/Example/S2D-Orthos/Graz/ortho"
},
"1": {
"Path": "Q:/Analyze/TrainingScenes/Datasets/Example/S2D-Orthos/Graz/dsm"
}
}
}
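Consumers reading the DSM tiles should discard NoData pixels before using the heights. A minimal numpy sketch, assuming the tile has already been loaded into an array:
import numpy as np

def valid_heights(dsm_tile: np.ndarray, nodata: float) -> np.ma.MaskedArray:
    """DSM heights with NoData pixels masked out."""
    return np.ma.masked_equal(dsm_tile, nodata)

# Fake 2x2 tile using the NoData value of the sample above:
tile = np.array([[252.4, -9999.0], [253.1, 252.9]])
print(valid_heights(tile, -9999).mean())  # averages valid pixels only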
6) Meshes
Meshes are given in a MeshCollection section. The 3SM and 3MX formats are currently supported. When geo-referenced, these formats store their SRS on their side, so providing an SRS in the ContextScene file is not mandatory. If you do set an SRS in the ContextScene, which can be done for the whole mesh collection or per mesh, it overrides the one stored in the mesh. The same is true for bounding boxes. See the following two examples:
{
"version": "6.0",
"SpatialReferenceSystems": {
"0": {
"Definition": "EPSG:25829"
}
},
"MeshCollection": {
"SRSId": 0,
"Meshes": {
"0": {
"Name": "",
"Path": "0:Production_1_3SM.3sm"
}
}
},
"References": {
"0": {
"Path": "Q:/Dataset/Bridge"
}
}
}
{
"version": "6.0",
"MeshCollection": {
"Meshes": {
"0": {
"Name": "",
"Path": "0:Production_1.3mx",
"BoundingBox": {
"xmin": -100,
"ymin": -120,
"zmin": 10,
"xmax": 100,
"ymax": 80,
"zmax": 35
}
}
}
},
"References": {
"0": {
"Path": "Q:/Dataset/Bridge"
}
}
}
7) Point Clouds
Similarly, point clouds are given in a PointCloudCollection section. The usual formats are supported in a ContextScene file, as well as OPC and POD. Neither the SRS nor the BoundingBox is read from the point cloud files themselves, so you have to provide them in the ContextScene if needed. Here are some examples:
{
"version": "6.0",
"SpatialReferenceSystems": {
"0": {
"Definition": "EPSG:25829"
}
},
"PointCloudCollection": {
"SRSId": 0,
"PointClouds": {
"0": {
"Name": "",
"BoundingBox": {
"xmin": 758968.81799577,
"ymin": 4066931.70199941,
"zmin": 25.9644992494779,
"xmax": 759084.955992652,
"ymax": 4067043.41599757,
"zmax": 50.4674979874532
},
"Path": "0:point_cloud.opc"
}
}
},
"References": {
"0": {
"Path": "Q:/DataSets/Spain/OPC"
}
}
}
{
"version": "6.0",
"SpatialReferenceSystems": {
"0": {
"Definition": "EPSG:25829"
}
},
"PointCloudCollection": {
"SRSId": 0,
"PointClouds": {
"0": {
"Name": "",
"Path": "0:spain.las"
}
}
},
"References": {
"0": {
"Path": "Q:/DataSets/Spain/LAS"
}
}
}
{
"version": "6.0",
"SpatialReferenceSystems": {
"0": {
"Definition": "ENU:36.7127,-6.10034"
}
},
"PointCloudCollection": {
"SRSId": 0,
"PointClouds": {
"0": {
"Name": "",
"Path": "0:PointCloud.pod"
}
}
},
"References": {
"0": {
"Path": "rds:7c00e184-5913-423b-8b4c-840ceb4bf616"
}
}
}
Please note that the ENU:LATITUDE,LONGITUDE SRS definition used in the above example is not standard. You might encounter this SRS definition in files generated by Reality Modeling (check the Reality Modeling User Guide).
It is also possible to use the PointCloudCollection section to provide point clouds to Reality Modeling, either static or mobile. Supported formats are E57, LAS and PTX. The bounding box property is not needed in that case, but other properties are introduced so that point clouds can be imported into Reality Modeling.
Here's an example for a static E57 file with known location within the file:
{
"version": "6.0",
"SpatialReferenceSystems": {
"0": {
"Definition": "EPSG:4978"
}
},
"PointCloudCollection": {
"SRSId": 0,
"PointClouds": {
"0": {
"Name": "pump",
"Path": "0:pump.e57",
"Type": "Static",
"Location": "InFile"
}
}
},
"References": {
"0": {
"Path": "rds:1a043cb1-c8d8-4af9-9b3a-10ba8cd47800"
}
}
}
Now, an example for a LAS file with scanner location provided in the scene:
{
"version": "6.0",
"SpatialReferenceSystems": {
"0": {
"Definition": "EPSG:4978"
}
},
"PointCloudCollection": {
"SRSId": 0,
"PointClouds": {
"0": {
"Name": "pump",
"Path": "0:pump.las",
"Type": "Static",
"Location": "Center",
"Center": {
"x": 12.5,
"y": 20.3,
"z": 5.4
}
}
}
},
"References": {
"0": {
"Path": "rds:1a043cb1-c8d8-4af9-9b3a-10ba8cd47800"
}
}
}
And finally, an example for a LAS file with unknown scanner location:
{
"version": "6.0",
"SpatialReferenceSystems": {
"0": {
"Definition": "EPSG:4978"
}
},
"PointCloudCollection": {
"SRSId": 0,
"PointClouds": {
"0": {
"Name": "pump",
"Path": "0:pump.las",
"Type": "Static",
"Location": "Unknown"
}
}
},
"References": {
"0": {
"Path": "rds:1a043cb1-c8d8-4af9-9b3a-10ba8cd47800"
}
}
}
For mobile point clouds, it is necessary to provide a TrajectoryId. See for example:
{
"version": "6.0",
"SpatialReferenceSystems": {
"0": {
"Definition": "EPSG:4978"
}
},
"PointCloudCollection": {
"SRSId": 0,
"PointClouds": {
"0": {
"Name": "mobile",
"Path": "0:mobile.las",
"Type": "Mobile",
"SRSId": 0,
"TrajectoryId": 0
}
}
},
"TrajectoryCollection": {
"SRSId": 0,
"Trajectories": {
"0": {
"Paths": ["0:traj_1.csv", "0:traj_2.csv"],
"Delimiters": [" ", ","],
"CombineConsecutiveDelimiters": true,
"DecimalSeparator": ".",
"LinesToIgnore": 1,
"TimeColumnId": 0,
"XColumnId": 1,
"YColumnId": 2,
"ZColumnId": 3
}
}
},
"References": {
"0": {
"Path": "rds:1a043cb1-c8d8-4af9-9b3a-10ba8cd47800"
}
}
}
The TrajectoryId points to an item in the TrajectoryCollection. In such an item, you specify the trajectory files to use and the options for parsing them. Note that DecimalSeparator can be either . or ,, that each delimiter must consist of a single character, and that the column ids must all differ from each other.
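As an illustration, these options map directly onto a small reader. Here is a Python sketch (read_trajectory is our name, not an SDK function) applying Delimiters, CombineConsecutiveDelimiters, DecimalSeparator, LinesToIgnore and the column ids:
import re

def read_trajectory(path: str, options: dict) -> list:
    """Parse one trajectory file with the options of a TrajectoryCollection
    item (sketch; the key names follow the JSON above)."""
    pattern = "|".join(re.escape(d) for d in options["Delimiters"])
    if options.get("CombineConsecutiveDelimiters"):
        pattern = f"(?:{pattern})+"
    points = []
    with open(path) as f:
        for line in list(f)[options.get("LinesToIgnore", 0):]:
            fields = re.split(pattern, line.strip())
            def num(column_id):
                return float(fields[column_id].replace(options["DecimalSeparator"], "."))
            points.append((num(options["TimeColumnId"]), num(options["XColumnId"]),
                           num(options["YColumnId"]), num(options["ZColumnId"])))
    return points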
8) Annotations
So far, we have seen how to refer to raw reality data and how to add extra metadata such as localization. The next section is specific to Reality Analysis and is usually filled in by the analysis, or annotated manually by the user for training purposes. It consists of an AnnotationCollection, which specifies Labels and one or several Annotations. A Label is a class of what has been detected: a traffic sign, a power line, ground, vegetation, etc. An Annotation specifies what has been detected and where: a position, a geometric shape, every pixel of an image, every point of a point cloud, etc.
Available Annotation types
2D objects
Objects are detected in photos as axis-aligned boxes. In this case, the Annotation specifies a set of boxes (coordinates and label) in every photo. Coordinates are relative to the photo size, between 0 and 1. A confidence between 0 and 1 may also be provided for each detected object. Note that each object has a UUID specific to the photo in which it was detected: two objects cannot share the same UUID. Here is a small example:
{
"version": "6.0",
"PhotoCollection": {
"Photos": {
"0": {
"ImagePath": "0:img_1059.JPG"
},
"1": {
"ImagePath": "0:img_1060.JPG"
},
"2": {
"ImagePath": "0:img_1061.JPG"
}
}
},
"AnnotationCollection": {
"Labels": {
"3": {
"Name": "car"
},
"4": {
"Name": "motorcycle"
}
},
"Annotations": [
{
"Type": "Objects2D",
"Objects": {
"0": {
"e256276a-d478-476d-a654-195dbc23d1d9": {
"LabelInfo": {
"Confidence": 0.998534977436066,
"LabelId": 3
},
"Box2D": {
"xmin": 0.0319100245833397,
"ymin": 0.537032723426819,
"xmax": 0.374318659305573,
"ymax": 0.66499537229538
}
},
"03f2b2e5-5df5-49a0-8515-62d6df7739bd": {
"LabelInfo": {
"Confidence": 0.996562480926514,
"LabelId": 3
},
"Box2D": {
"xmin": 0.877566039562225,
"ymin": 0.4940065741539,
"xmax": 1,
"ymax": 0.62068098783493
}
}
},
"1": {
"b61c36dd-98d2-4a2a-902e-9d12d9472f15": {
"LabelInfo": {
"Confidence": 0.983914494514465,
"LabelId": 3
},
"Box2D": {
"xmin": 0.854629874229431,
"ymin": 0.483299434185028,
"xmax": 0.938638925552368,
"ymax": 0.547508895397186
}
},
"71eab044-afc0-45e3-a9bf-d9c96689db58": {
"LabelInfo": {
"Confidence": 0.924463272094727,
"LabelId": 3
},
"Box2D": {
"xmin": 0.98187381029129,
"ymin": 0.479612439870834,
"xmax": 1,
"ymax": 0.551942944526672
}
}
},
"2": {
"fbd8a303-06de-4368-910d-080c8a9024fc": {
"LabelInfo": {
"Confidence": 0.983137726783752,
"LabelId": 3
},
"Box2D": {
"xmin": 0.943033456802368,
"ymin": 0.504623711109161,
"xmax": 0.998084306716919,
"ymax": 0.579438984394073
}
}
}
}
}
]
},
"References": {
"0": {
"Path": "Q:/Analyze/DataSets/Images"
}
}
}
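Since the coordinates are normalized, a consumer must multiply by the photo dimensions to get pixel boxes. A minimal Python sketch (the 5472x3648 size below is hypothetical, since an Objects2D scene does not necessarily carry a Device entry):
def box_to_pixels(box: dict, width: int, height: int) -> tuple:
    """Convert a normalized Box2D to pixel coordinates."""
    return (box["xmin"] * width, box["ymin"] * height,
            box["xmax"] * width, box["ymax"] * height)

# First car of the sample above, for a hypothetical 5472x3648 photo:
print(box_to_pixels({"xmin": 0.0319100245833397, "ymin": 0.537032723426819,
                     "xmax": 0.374318659305573, "ymax": 0.66499537229538},
                    5472, 3648))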
2D segmentation
To store one label per pixel, which is often called semantic segmentation, we use a PNG image file. Each pixel is a 16-bit unsigned integer set to the label of the corresponding pixel in the annotated photo. The value 65535 is reserved for pixels where the label is unknown. This value is generally used when the ContextScene is the result of a manual annotation: regions whose annotation should be ignored for the machine learning training are set to 65535. The PNG files are stored in a separate folder, identified by the reference path prefixed in front of the mask filename. Here is an example:
{
"version": "6.0",
"PhotoCollection": {
"Photos": {
"0": {
"ImagePath": "0:img_1059.JPG"
},
"1": {
"ImagePath": "0:img_1060.JPG"
},
"2": {
"ImagePath": "0:img_1061.JPG"
}
}
},
"AnnotationCollection": {
"Labels": {
"0": {
"Name": "background"
},
"2": {
"Name": "bicycle"
},
"6": {
"Name": "bus"
},
"7": {
"Name": "car"
}
},
"Annotations": [
{
"Type": "Segmentation2D",
"Segmentations": {
"0": {
"Path": "1:0.png"
},
"1": {
"Path": "1:1.png"
},
"2": {
"Path": "1:2.png"
}
}
}
]
},
"References": {
"0": {
"Path": "Q:/Analyze/DataSets/Motos/images"
},
"1": {
"Path": "Q:/Analyze/DataSets/Motos/segmentedPhotos"
}
}
}
It may sometimes be useful to reassign certain 16-bit integers in the PNG files to other classes. This avoids creating new PNG files when labels have to be renumbered, for instance to make two different annotations use the same labels. For this purpose, a LabelMap entry is provided. Here is an example of an annotation where the integers 3, 4 and 15 in the PNG files are remapped, while the other integers are left unchanged.
{
...
"Labels" : {
"0" : {
"Name" : "background"
},
"1" : {
"Name" : "aeroplane"
},
"86" : {
"Name" : "bird"
},
"87" : {
"Name" : "boat"
},
"5" : {
"Name" : "bottle"
},
"88" : {
"Name" : "person"
},
"16" : {
"Name" : "pottedplant"
}
},
"Annotations" : [
{
"Type" : "Segmentation2D",
"LabelMap" : {
"3" : 86,
"4" : 87,
"15" : 88
},
"Segmentations" : {
"0" : {
"Path" : "1:0.png"
},
"1" : {
"Path" : "1:1.png"
...
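Applied to a decoded mask, a LabelMap is a plain integer remapping. A minimal numpy sketch of the semantics described above (integers absent from the map, including the reserved 65535, pass through unchanged):
import numpy as np

def apply_label_map(mask: np.ndarray, label_map: dict) -> np.ndarray:
    """Remap mask integers according to a LabelMap; integers absent from
    the map, including the reserved 65535, pass through unchanged."""
    out = mask.copy()
    for source, target in label_map.items():
        out[mask == int(source)] = target
    return out

# LabelMap of the example above: 3 -> 86, 4 -> 87, 15 -> 88.
mask = np.array([[0, 3], [15, 65535]], dtype=np.uint16)
print(apply_label_map(mask, {"3": 86, "4": 87, "15": 88}))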
2D segmentation in orthophotos
The same type of segmentation applies to orthophotos:
{
"version": "6.0",
"SpatialReferenceSystems": {
"0": {
"Definition": "EPSG:2193"
}
},
"PhotoCollection": {
"SRSId": 0,
"Devices": {
"0": {
"Type": "orthotile",
"Dimensions": {
"width": 632,
"height": 2000
},
"PixelSize": {
"Width": 0.1,
"Height": -0.1
},
"NoData": -9999
}
},
"Photos": {
"0": {
"ImagePath": "0:img_NoMerge_ortho_part_1_2.tif",
"DeviceId": 0,
"Location": {
"UlX": 533550.937633621,
"UlY": 5212434.93763362
},
"DepthPath": "0:img_NoMerge_DSM_part_1_2.tif"
},
"1": {
"ImagePath": "0:img_NoMerge_ortho_part_1_1.tif",
"DeviceId": 0,
"Location": {
"UlX": 533350.937633621,
"UlY": 5212434.93763362
},
"DepthPath": "0:img_NoMerge_DSM_part_1_1.tif"
}
}
},
"AnnotationCollection": {
"Labels": {
"0": {
"Name": "background"
},
"2": {
"Name": "Building"
},
"4": {
"Name": "Car"
}
},
"Annotations": [
{
"Type": "Segmentation2D",
"Segmentations": {
"0": {
"Path": "1:img_NoMerge_ortho_part_1_2_mask_.png"
},
"1": {
"Path": "1:img_NoMerge_ortho_part_1_1_mask_.png"
}
}
}
]
},
"References": {
"0": {
"Path": "Q:/Analyze/DataSets/Graz/OrthophotoImages"
},
"1": {
"Path": "Q:/Analyze/DataSets/Graz/segmentedPhotos"
}
}
}
Here again, a LabelMap can be specified.
3D objects
3D objects are described as 3D boxes. These boxes are given by a range in every direction. An optional rotation may be specified if aligning the box with the axes is too restrictive. It is given as a 3x3 matrix and corresponds to a rotation centered at the center of the box: the point [(xmin+xmax)/2, (ymin+ymax)/2, (zmin+zmax)/2] remains the central position of the object. An optional SRS can be provided for geo-referenced scenes. Here is an example:
{
"version": "6.0",
"SpatialReferenceSystems": {
"0": {
"Definition": "ENU:36.71339,-6.10019"
}
},
"AnnotationCollection": {
"Labels": {
"30": {
"Name": "light signal"
},
"29": {
"Name": "pole"
},
"28": {
"Name": "manhole"
}
},
"Annotations": [
{
"Type": "Objects3D",
"SRSId": 0,
"Objects": {
"85b8dbe8-ba72-4f51-9343-f8763a2d7d77": {
"LabelInfo": {
"LabelId": 28
},
"RotatedBox3D": {
"Box3D": {
"xmin": -18.2450134852094,
"ymin": -33.4309299351174,
"zmin": 35.994338294097,
"xmax": -17.6879619720465,
"ymax": -32.9339199858639,
"zmax": 36.0384178477738
},
"Rotation": {
"M_00": 0.923586176565216,
"M_01": -0.38339088989913,
"M_02": 0,
"M_10": 0.38339088989913,
"M_11": 0.923586176565216,
"M_12": 0,
"M_20": 0,
"M_21": 0,
"M_22": 1
}
}
},
"ffb2b009-6d22-4b1b-a9bd-9232f7f762d6": {
"LabelInfo": {
"LabelId": 28
},
"RotatedBox3D": {
"Box3D": {
"xmin": -27.1921100305691,
"ymin": -25.0032024072245,
"zmin": 36.0454853671103,
"xmax": -26.5712067997615,
"ymax": -24.6983481328806,
"zmax": 36.0555013530683
},
"Rotation": {
"M_00": -0.245754837557398,
"M_01": 0.969332017327983,
"M_02": 0,
"M_10": -0.969332017327983,
"M_11": -0.245754837557398,
"M_12": 0,
"M_20": 0,
"M_21": 0,
"M_22": 1
}
}
},
"fe96357c-ca3e-4548-b676-2b4a167f5825": {
"LabelInfo": {
"LabelId": 31
},
"RotatedBox3D": {
"Box3D": {
"xmin": -4.32935880992287,
"ymin": -15.5213085471863,
"zmin": 36.6455288658342,
"xmax": -3.92620580047086,
"ymax": -15.178094920882,
"zmax": 36.7824782388701
},
"Rotation": {
"M_00": 0.0130315651698905,
"M_01": 0.99991508554938,
"M_02": 0,
"M_10": -0.99991508554938,
"M_11": 0.0130315651698905,
"M_12": 0,
"M_20": 0,
"M_21": 0,
"M_22": 1
}
}
}
}
}
]
}
}
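Under the rule above, the corners of a rotated box are obtained by rotating the axis-aligned corners about the fixed box center. A minimal numpy sketch (whether the matrix or its transpose applies depends on the row/column-vector convention; this sketch rotates row vectors):
import numpy as np
from itertools import product

def box_corners(rotated_box: dict) -> np.ndarray:
    """8 corners of a RotatedBox3D: axis-aligned corners rotated about the
    (fixed) box center. Rows of the result are corner coordinates."""
    box = rotated_box["Box3D"]
    m = rotated_box["Rotation"]
    rot = np.array([[m[f"M_{i}{j}"] for j in range(3)] for i in range(3)])
    center = np.array([(box["xmin"] + box["xmax"]) / 2,
                       (box["ymin"] + box["ymax"]) / 2,
                       (box["zmin"] + box["zmax"]) / 2])
    corners = np.array(list(product((box["xmin"], box["xmax"]),
                                    (box["ymin"], box["ymax"]),
                                    (box["zmin"], box["zmax"]))))
    return (corners - center) @ rot.T + center  # row vectors, hence rot.T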
3D segmentation
To store one label per point in a point cloud, which is often called semantic segmentation, we use the OPC format. This format, with its levels of detail, is designed to be streamed and displayed efficiently. Each point is decorated with a 16-bit unsigned integer set to its label. As for 2D segmentation, the value 65535 is reserved for points where the label is unknown, and a LabelMap can be specified to remap integers in the OPC file.
{
"version": "6.0",
"SpatialReferenceSystems": {
"0": {
"Definition": "EPSG:26910"
}
},
"AnnotationCollection": {
"Labels": {
"21": {
"Name": "Vehicle"
},
"6": {
"Name": "Roof"
},
"5": {
"Name": "Tree"
}
},
"Annotations": [
{
"Type": "Segmentation3D",
"SRSId": 0,
"Path": "0:PointCloud.opc"
}
]
},
"References": {
"0": {
"Path": "E:/Dataset/Graz/segmentedPointCloud"
}
}
}
2D lines
The following annotations describe a set of lines in a plane, resulting from the analysis of an orthophoto; they might be cracks over a road, a road network, rails, etc. We call them lines, though the topology required to describe these entities can be as complex as a graph with junctions, loops, etc. Hence, a Line2D is given by a set of 2D vertices and a set of segments between these vertices. To describe the "thickness" of the original entity that has been vectorized into a line, we use the mathematically sound concept of diameter: the diameter of the largest disc centered at a given vertex that contains only points of the entity. Here is a small example:
{
"version": "6.0",
"SpatialReferenceSystems": {
"0": {
"Definition": "EPSG:32615"
}
},
"AnnotationCollection": {
"Labels": {
"0": {
"Name": "background"
},
"1": {
"Name": "crack"
}
},
"Annotations": [
{
"Type": "Lines2D",
"SRSId": 0,
"Lines": {
"9c508e50-476d-4585-8a4b-d5885a652c8d": {
"LabelInfo": {
"LabelId": 1
},
"Length": 2.12,
"MeanDiameter": 0.04,
"MaxDiameter": 0.0762712781246574,
"Vertices": {
"0": {
"Position": {
"x": 479868.86,
"y": 4980803.44157715
},
"Diameter": 0.0202617157732701
},
"1": {
"Position": {
"x": 479868.712150948,
"y": 4980803.47680339
},
"Diameter": 0.0664043075900635
},
"2": {
"Position": {
"x": 479868.341404493,
"y": 4980803.67640449
},
"Diameter": 0.0686980153798615
},
"3": {
"Position": {
"x": 479868.468095477,
"y": 4980803.63543635
},
"Diameter": 0.0762712781246574
}
},
"Segments": [
{
"VertexId1": 0,
"VertexId2": 1
},
{
"VertexId1": 1,
"VertexId2": 3
},
{
"VertexId1": 2,
"VertexId2": 3
}
]
},
"cf309ac4-5441-46d8-8fb5-b25dd85b3c35": {
"LabelInfo": {
"LabelId": 1
},
"Length": 1.12,
"MeanDiameter": 0.04,
"MaxDiameter": 0.0908217292202603,
"Vertices": {
"0": {
"Position": {
"x": 479874.427917961,
"y": 4980802.47
},
"Diameter": 0.0185410197570828
},
"1": {
"Position": {
"x": 479874.28053349,
"y": 4980802.75
},
"Diameter": 0.0339112355266864
},
"2": {
"Position": {
"x": 479874.373927691,
"y": 4980802.67226102
},
"Diameter": 0.0978553816042589
},
"3": {
"Position": {
"x": 479874.485410865,
"y": 4980802.89236351
},
"Diameter": 0.0908217292202603
},
"4": {
"Position": {
"x": 479874.37,
"y": 4980802.64363961
},
"Diameter": 0.0899999999783128
},
"5": {
"Position": {
"x": 479874.418415482,
"y": 4980802.71591548
},
"Diameter": 0.0813172786177372
},
"6": {
"Position": {
"x": 479874.83328093,
"y": 4980803.53223969
},
"Diameter": 0.0816060783607562
},
"7": {
"Position": {
"x": 479874.612768291,
"y": 4980803.11426725
},
"Diameter": 0.086105905882709
},
"8": {
"Position": {
"x": 479874.677758315,
"y": 4980803.21801663
},
"Diameter": 0.0872066516702332
},
"9": {
"Position": {
"x": 479874.55828698,
"y": 4980802.99093257
},
"Diameter": 0.0900163930024753
},
"10": {
"Position": {
"x": 479874.755737061,
"y": 4980803.36709735
},
"Diameter": 0.0900001483887829
},
"11": {
"Position": {
"x": 479874.456513669,
"y": 4980802.78786825
},
"Diameter": 0.0969726626446287
},
"12": {
"Position": {
"x": 479874.714993671,
"y": 4980803.31748734
},
"Diameter": 0.109567331227317
}
},
"Segments": [
{
"VertexId1": 0,
"VertexId2": 4
},
{
"VertexId1": 1,
"VertexId2": 2
},
{
"VertexId1": 2,
"VertexId2": 5
},
{
"VertexId1": 2,
"VertexId2": 4
},
{
"VertexId1": 3,
"VertexId2": 9
},
{
"VertexId1": 3,
"VertexId2": 11
},
{
"VertexId1": 5,
"VertexId2": 11
},
{
"VertexId1": 6,
"VertexId2": 10
},
{
"VertexId1": 7,
"VertexId2": 9
},
{
"VertexId1": 7,
"VertexId2": 8
},
{
"VertexId1": 8,
"VertexId2": 12
},
{
"VertexId1": 10,
"VertexId2": 12
}
]
}
}
}
]
}
}
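For consumers, the line geometry is self-contained: the length can be recomputed from the vertices and segments. A minimal Python sketch (the Length values in the sample above are illustrative, so an exact match is not guaranteed):
import math

def line_length(line: dict) -> float:
    """Recompute the length of a Line2D from its vertices and segments."""
    positions = {key: vertex["Position"] for key, vertex in line["Vertices"].items()}
    total = 0.0
    for segment in line["Segments"]:
        p = positions[str(segment["VertexId1"])]
        q = positions[str(segment["VertexId2"])]
        total += math.hypot(q["x"] - p["x"], q["y"] - p["y"])
    return total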
3D lines
The following annotations describe a set of lines in space. They might be cracks on a bridge, power lines, rails, etc. As in 2D, we call them lines, but the topology required to describe these entities can be as complex as a graph with junctions, loops, etc. Hence, a Line3D is given by a set of 3D vertices and a set of segments between these vertices. If required, to describe the "thickness" of the original entity that has been vectorized into a line, we use the mathematically sound concept of diameter: the diameter of the largest ball centered at a given vertex that contains only points of the entity. Here is a small example:
{
"version": "6.0",
"SpatialReferenceSystems": {
"0": {
"Definition": "EPSG:32615"
}
},
"AnnotationCollection": {
"Labels": {
"1": {
"Name": "crack"
}
},
"Annotations": [
{
"Type": "Lines3D",
"SRSId": 0,
"Lines": {
"4c71ed7b-00fc-4591-8d36-7abe896937c7": {
"LabelInfo": {
"LabelId": 1
},
"Length": 1.12,
"MeanDiameter": 0.05,
"MaxDiameter": 0.0813741514791665,
"Vertices": {
"0": {
"Position": {
"x": 5.12539769932528,
"y": -0.228260537736713,
"z": 8.04510725583266
},
"Diameter": 0.0758896857238085
},
"1": {
"Position": {
"x": 5.09693810098787,
"y": -0.254084667116378,
"z": 8.02793394946929
},
"Diameter": 0.075647846094568
},
"2": {
"Position": {
"x": 4.94279933680971,
"y": -0.541653249313817,
"z": 7.83193508145411
},
"Diameter": 0.0813741514791665
}
},
"Segments": [
{
"VertexId1": 0,
"VertexId2": 1
},
{
"VertexId1": 1,
"VertexId2": 2
}
]
},
"141d074f-6446-450d-af8f-47c30dcc0493": {
"LabelInfo": {
"LabelId": 1
},
"Length": 2.12,
"MeanDiameter": 0.071,
"MaxDiameter": 0.0861954450358527,
"Vertices": {
"0": {
"Position": {
"x": 0.367889359573153,
"y": -1.83500784023564,
"z": 4.21492132690978
},
"Diameter": 0.0861954450358527
},
"1": {
"Position": {
"x": 0.419774974444863,
"y": -1.98105014611317,
"z": 4.21475624160097
},
"Diameter": 0.0673995244878007
}
},
"Segments": [
{
"VertexId1": 0,
"VertexId2": 1
}
]
}
}
}
]
}
}
2D polygons
The following annotations describe a set of 2D polygons in a plane. They might be building contours, regions with rust, etc. A Polygon2D is defined by a set of 2D vertices and a set of closed lines: at least one outer boundary and, optionally, several inner boundaries. Here is a small example:
{
"version": "6.0",
"SpatialReferenceSystems": {
"0": {
"Definition": "EPSG:32615"
}
},
"AnnotationCollection": {
"Labels": {
"0": {
"Name": "background"
},
"1": {
"Name": "roof"
}
},
"Annotations": [
{
"Type": "Polygons2D",
"SRSId": 0,
"Polygons": {
"3170f694-fceb-4a30-8407-9c4b8b995c7a": {
"LabelInfo": {
"LabelId": 1
},
"Vertices": {
"0": {
"Position": {
"x": 1569048.90925996,
"y": 5181354.7953857
}
},
"1": {
"Position": {
"x": 1569046.03675745,
"y": 5181359.98194182
}
},
"2": {
"Position": {
"x": 1569039.95559443,
"y": 5181360.00584799
}
},
"3": {
"Position": {
"x": 1569039.9138761,
"y": 5181351.26758661
}
}
},
"OuterBoundary": {
"VertexIds": [0, 1, 2, 3]
},
"Area": 1.34
},
"deff5dbe-1324-487f-aa54-59adaf3c27a1": {
"LabelInfo": {
"LabelId": 1
},
"Vertices": {
"0": {
"Position": {
"x": 1569161.77267965,
"y": 5181355.08644351
}
},
"1": {
"Position": {
"x": 1569163.69190289,
"y": 5181351.55282655
}
},
"2": {
"Position": {
"x": 1569166.375,
"y": 5181350.625
}
},
"3": {
"Position": {
"x": 1569166.45,
"y": 5181350.625
}
},
"4": {
"Position": {
"x": 1569172.09739531,
"y": 5181344.35647846
}
},
"5": {
"Position": {
"x": 1569178.75182,
"y": 5181347.82391931
}
},
"6": {
"Position": {
"x": 1569175.69844311,
"y": 5181353.75815953
}
},
"7": {
"Position": {
"x": 1569179.34246425,
"y": 5181353.40052368
}
},
"8": {
"Position": {
"x": 1569191.35,
"y": 5181358.95
}
},
"9": {
"Position": {
"x": 1569191.35,
"y": 5181359.025
}
},
"10": {
"Position": {
"x": 1569173.575,
"y": 5181359.175
}
},
"11": {
"Position": {
"x": 1569172.14475312,
"y": 5181359.99780775
}
},
"12": {
"Position": {
"x": 1569163.10653567,
"y": 5181360.00136519
}
},
"13": {
"Position": {
"x": 1569164.47593204,
"y": 5181357.2323884
}
}
},
"OuterBoundary": {
"VertexIds": [0, 1, 2, 3, 4, 5]
},
"InnerBoundaries": [
{
"VertexIds": [6, 7, 8, 9]
},
{
"VertexIds": [10, 11, 12, 13]
}
],
"Area": 2.14
}
}
}
]
}
}
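The Area field can likewise be recomputed from the boundaries, for instance with the shoelace formula, subtracting the inner boundaries from the outer one. A minimal Python sketch (the Area values in the sample are illustrative):
def ring_area(vertices: dict, vertex_ids: list) -> float:
    """Unsigned area of one closed boundary (shoelace formula)."""
    pts = [vertices[str(i)]["Position"] for i in vertex_ids]
    s = sum(p["x"] * q["y"] - q["x"] * p["y"]
            for p, q in zip(pts, pts[1:] + pts[:1]))
    return abs(s) / 2

def polygon_area(polygon: dict) -> float:
    """Outer boundary area minus the areas of the inner boundaries."""
    area = ring_area(polygon["Vertices"], polygon["OuterBoundary"]["VertexIds"])
    for inner in polygon.get("InnerBoundaries", []):
        area -= ring_area(polygon["Vertices"], inner["VertexIds"])
    return area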
3D polygons
The following annotations describe a set of 3D polygons in space. They might be regions with rust, etc. A Polygon3D is defined by a set of 3D vertices and a set of closed curved lines: at least one outer boundary and, optionally, several inner boundaries. Here is a small example:
{
"version" : "6.0",
"SpatialReferenceSystems" : {
"0" : {
"Definition" : "ENU:44.94515,-93.08863"
}
},
"AnnotationCollection" : {
"Labels" : {
"1" : {
"Name" : "spalling"
}
},
"Annotations" : [
{
"Type" : "Polygons3D",
"SRSId" : 0,
"Polygons" : {
"5def1f38-2f81-4733-93cc-6acf40d765d9" : {
"LabelInfo" : {
"LabelId" : 1
},
"Vertices" : {
"0" : {
"Position" : {
"x" : -12.8350982666016,
"y" : -1.28830945491791,
"z" : 199.686065673828
}
},
...
},
"OuterBoundary" : {
"VertexIds" : [
0,
1,
...
]
},
"Area" : 0.00158626517052999,
"Depth" : 0.00782687733284992
},
"844ebd79-2adb-4696-a219-9a8fe9fa053c" : {
...
}
}
}
]
}
}
Other annotations
Some types of annotations are not part of the ContextScene format yet. Since they are planned, it seems important to mention them here:
- Positions: when the spatial extent of objects is of no interest but their location is, some Reality Analysis jobs export these positions in ESRI SHP files. This is not yet persisted in a ContextScene. A workaround so far is to use the center point of their 3D boxes.
- Tags: in some cases, no position at all is required, just a list of labels. For instance: "this image contains people and cars". This case is not available yet and will be added.