Usage
Both deserialization and serialization of PLY file data is done through
PlyData and PlyElement instances.
>>> import numpy
>>> from plyfile import PlyData, PlyElement
>>>
For the code examples that follow, assume the file tet.ply contains
the following text:
ply
format ascii 1.0
comment single tetrahedron with colored faces
element vertex 4
comment tetrahedron vertices
property float x
property float y
property float z
element face 4
property list uchar int vertex_indices
property uchar red
property uchar green
property uchar blue
end_header
0 0 0
0 1 1
1 0 1
1 1 0
3 0 1 2 255 255 255
3 0 2 3 255 0 0
3 0 1 3 0 255 0
3 1 2 3 0 0 255
(This file is available under the examples directory.)
Reading a PLY file
>>> plydata = PlyData.read('tet.ply')
>>>
or
>>> with open('tet.ply', 'rb') as f:
... plydata = PlyData.read(f)
>>>
The static method PlyData.read returns a PlyData instance, which is
plyfile’s representation of the data in a PLY file. A PlyData
instance has an attribute elements, which is a list of PlyElement
instances, each of which has a data attribute which is a numpy
structured array containing the numerical data. PLY file elements map
onto numpy structured arrays in a pretty obvious way. For a list
property in an element, by default, the corresponding numpy field type
is object, with the members being numpy arrays (see the
vertex_indices example below).[1]
Concretely:
>>> plydata.elements[0].name
'vertex'
>>> plydata.elements[0].data[0].tolist()
(0.0, 0.0, 0.0)
>>> plydata.elements[0].data['x']
array([0., 0., 1., 1.], dtype=float32)
>>> plydata['face'].data['vertex_indices'][0]
array([0, 1, 2], dtype=int32)
>>>
For convenience, elements and properties can be looked up by name:
>>> plydata['vertex']['x']
array([0., 0., 1., 1.], dtype=float32)
>>>
and elements can be indexed directly without explicitly going through
the data attribute:
>>> plydata['vertex'][0].tolist()
(0.0, 0.0, 0.0)
>>>
The above expression is equivalent to plydata['vertex'].data[0].
PlyElement instances also contain metadata:
>>> plydata.elements[0].properties
(PlyProperty('x', 'float'), PlyProperty('y', 'float'), PlyProperty('z', 'float'))
>>> plydata.elements[0].count
4
>>>
PlyProperty and PlyListProperty instances are used internally as a
convenient intermediate representation of PLY element properties that
can easily be serialized to a PLY header (using str) or converted to
numpy-compatible type descriptions (via the dtype method). It’s not
extremely common to manipulate them directly, but if needed, the
property metadata of an element can be accessed as a tuple via the
properties attribute (as illustrated above) or looked up by name:
>>> plydata.elements[0].ply_property('x')
PlyProperty('x', 'float')
>>>
Many (but not necessarily all) types of malformed input files will raise
PlyParseError when PlyData.read is called. The string value of the
PlyParseError instance (as well as attributes element, row, and
prop) provides additional context for the error if applicable.
Faster reading via memory mapping
To accelerate parsing of binary data, plyfile can make use of
numpy’s memory mapping facilities. The decision to memory map or not
is made on a per-element basis. To make this determination, there are
two cases to consider.
Case 1: elements with no list properties
If an element in a binary PLY file has no list properties, then it will be memory-mapped by default, subject to the capabilities of the underlying file object.
Memory mapping can be disabled or fine-tuned using the
mmapargument ofPlyData.read.To confirm whether a given element has been memory-mapped or not, check the type of
element.data.
This is all illustrated below:
>>> plydata.text = False
>>> plydata.byte_order = '<'
>>> plydata.write('tet_binary.ply')
>>>
>>> # Memory-mapping is enabled by default.
>>> plydata = PlyData.read('tet_binary.ply')
>>> isinstance(plydata['vertex'].data, numpy.memmap)
True
>>> # Any falsy value disables memory-mapping here.
>>> plydata = PlyData.read('tet_binary.ply', mmap=False)
>>> isinstance(plydata['vertex'].data, numpy.memmap)
False
>>> # Strings can also be given to fine-tune memory-mapping.
>>> # For example, with 'r+', changes can be written back to the file.
>>> # In this case, the file must be explicitly opened with read-write
>>> # access.
>>> with open('tet_binary.ply', 'r+b') as f:
... plydata = PlyData.read(f, mmap='r+')
>>> isinstance(plydata['vertex'].data, numpy.memmap)
True
>>> plydata['vertex']['x'] = 100
>>> plydata['vertex'].data.flush()
>>> plydata = PlyData.read('tet_binary.ply')
>>> all(plydata['vertex']['x'] == 100)
True
>>>
Case 2: elements with list properties
In the general case, elements with list properties cannot be
memory-mapped as numpy arrays, except in one important case: when
all list properties have fixed and known lengths. In that case, the
known_list_len argument can be given to PlyData.read:
>>> plydata = PlyData.read('tet_binary.ply',
... known_list_len={'face': {'vertex_indices': 3}})
>>> isinstance(plydata['face'].data, numpy.memmap)
True
>>>
The implementation will validate the data: if any instance of the list
property has a length other than the value specified, then
PlyParseError will be raised.
Note that in order to enable memory mapping for a given element,
all list properties in the element must have their lengths in the
known_list_len dictionary. If any list property does not have its
length given in known_list_len, then memory mapping will not be
attempted, and no error will be raised.
Creating a PLY file
The first step is to get your data into numpy structured arrays. Note
that there are some restrictions: generally speaking, if you know the
types of properties a PLY file element can contain, you can easily
deduce the restrictions. For example, PLY files don’t contain 64-bit
integer or complex data, so these aren’t allowed.
For convenience, non-scalar fields are allowed, and they will be
serialized as list properties. For example, when constructing a “face”
element, if all the faces are triangles (a common occurrence), it’s okay
to have a “vertex_indices” field of type 'i4' and shape (3,)
instead of type object and shape (). However, if the serialized PLY
file is read back in using plyfile, the “vertex_indices” property will
be represented as an object-typed field, each of whose values is an
array of type 'i4' and length 3. The reason is simply that the PLY
format provides no way to find out that each “vertex_indices” field has
length 3 without actually reading all the data, so plyfile has to
assume that this is a variable-length property. However, see the
FAQ for an easy way to recover a two-dimensional
array from a list property, and also see the notes above
about the known_list_len parameter to speed up the reading of files with
lists of fixed, known length.
For example, if we wanted to create the “vertex” and “face” PLY elements
of the tet.ply data directly as numpy arrays for the purpose of
serialization, we could do this:
>>> vertex = numpy.array([(0, 0, 0),
... (0, 1, 1),
... (1, 0, 1),
... (1, 1, 0)],
... dtype=[('x', 'f4'), ('y', 'f4'),
... ('z', 'f4')])
>>> face = numpy.array([([0, 1, 2], 255, 255, 255),
... ([0, 2, 3], 255, 0, 0),
... ([0, 1, 3], 0, 255, 0),
... ([1, 2, 3], 0, 0, 255)],
... dtype=[('vertex_indices', 'i4', (3,)),
... ('red', 'u1'), ('green', 'u1'),
... ('blue', 'u1')])
>>>
Once you have suitably structured array, the static method
PlyElement.describe can then be used to create the necessary
PlyElement instances:
>>> el = PlyElement.describe(vertex, 'vertex')
>>>
or
>>> el = PlyElement.describe(vertex, 'vertex',
... comments=['comment1',
... 'comment2'])
>>>
Note that there’s no need to create PlyProperty instances explicitly.
This is all done behind the scenes by examining some_array.dtype.descr.
One slight hiccup here is that variable-length fields in a numpy array
(i.e., our representation of PLY list properties)
must have a type of object, so the types of the list length and values
in the serialized PLY file can’t be obtained from the array’s dtype
attribute alone. For simplicity and predictability, the length
defaults to 8-bit unsigned integer, and the value defaults to 32-bit
signed integer, which covers the majority of use cases. Exceptions must
be stated explicitly:
>>> el = PlyElement.describe(face, 'face',
... val_types={'vertex_indices': 'u2'},
... len_types={'vertex_indices': 'u4'})
>>>
Now you can instantiate PlyData and serialize:
>>> PlyData([el]).write('some_binary.ply')
>>> PlyData([el], text=True).write('some_ascii.ply')
>>>
>>> # Force the byte order of the output to big-endian, independently of
>>> # the machine's native byte order
>>> PlyData([el],
... byte_order='>').write('some_big_endian_binary.ply')
>>>
>>> # Use a file object. Binary mode is used here, which will cause
>>> # Unix-style line endings to be written on all systems.
>>> with open('some_ascii.ply', mode='wb') as f:
... PlyData([el], text=True).write(f)
>>>
Miscellaneous
Getting a two-dimensional array from a list property
The PLY format provides no way to assert that all the data for a given
list property is of the same length, yet this is a relatively common
occurrence. For example, all the “vertex_indices” data on a “face”
element will have length three for a triangular mesh. In such cases,
it’s usually much more convenient to have the data in a two-dimensional
array, as opposed to a one-dimensional array of type object. Here’s a
pretty easy way to obtain a two dimensional array:
>>> plydata = PlyData.read('tet.ply')
>>> tri_data = plydata['face'].data['vertex_indices']
>>> triangles = numpy.vstack(tri_data)
>>>
(If the row lengths of all list properties are known in advance, the
known_list_len parameter can also be used.)
Instance mutability
A plausible code pattern is to read a PLY file into a PlyData
instance, perform some operations on it, possibly modifying data and
metadata in place, and write the result to a new file. This pattern is
partially supported. The following in-place mutations are possible:
Modifying numerical array data only.
Assigning directly to a
PlyDatainstance’selements.Switching format by changing the
textandbyte_orderattributes of aPlyDatainstance. This will switch betweenascii,binary_little_endian, andbinary_big_endianPLY formats.Modifying a
PlyDatainstance’scommentsandobj_info, and modifying aPlyElementinstance’scomments.Assigning to an element’s
data. Note that the property metadata inpropertiesis not touched by this, so for every property in thepropertieslist of thePlyElementinstance, thedataarray must have a field with the same name (but possibly different type, and possibly in different order). The array can have additional fields as well, but they won’t be output when writing the element to a PLY file. The properties in the output file will appear as they are in thepropertieslist. If an array field has a different type than the correspondingPlyPropertyinstance, then it will be cast when writing.Assigning directly to an element’s
properties. Note that thedataarray is not touched, and the previous note regarding the relationship betweenpropertiesanddatastill applies: the field names ofdatamust be a superset of the property names inproperties, but they can be in a different order and specify different types.Changing a
PlyPropertyorPlyListPropertyinstance’sval_dtypeor aPlyListPropertyinstance’slen_dtype, which will perform casting when writing.
Modifying the name of a PlyElement, PlyProperty, or
PlyListProperty instance is not supported and will raise an error. To
rename a property of a PlyElement instance, you can remove the
property from properties, rename the field in data, and re-add the
property to properties with the new name by creating a new
PlyProperty or PlyListProperty instance:
>>> from plyfile import PlyProperty, PlyListProperty
>>> face = plydata['face']
>>> face.properties = ()
>>> face.data.dtype.names = ['idx', 'r', 'g', 'b']
>>> face.properties = (PlyListProperty('idx', 'uchar', 'int'),
... PlyProperty('r', 'uchar'),
... PlyProperty('g', 'uchar'),
... PlyProperty('b', 'uchar'))
>>>
Note that it is always safe to create a new PlyElement or PlyData
instance instead of modifying one in place, and this is the recommended
style:
>>> # Recommended:
>>> plydata = PlyData([plydata['face'], plydata['vertex']],
... text=False, byte_order='<')
>>>
>>> # Also supported:
>>> plydata.elements = [plydata['face'], plydata['vertex']]
>>> plydata.text = False
>>> plydata.byte_order = '<'
>>> plydata.comments = []
>>> plydata.obj_info = []
>>>
Objects created by this library don’t claim ownership of the other
objects they refer to, which has implications for both styles (creating
new instances and modifying in place). For example, a single
PlyElement instance can be contained by multiple PlyData instances,
but modifying that instance will then affect all of those containing
PlyData instances.
Text-mode streams
Input and output on text-mode streams is supported for ASCII-format
PLY files, but not binary-format PLY files. Input and output on
binary streams is supported for all valid PLY files. Note that
sys.stdout and sys.stdin are text streams, so they can only be
used directly for ASCII-format PLY files.
Comments
Header comments are supported:
obj_infocomments are supported as well:When written, they will be placed after regular comments after the “format” line.
Comments can have leading whitespace, but trailing whitespace may be stripped and should not be relied upon. Comments may not contain embedded newlines.