DOM future work: Difference between revisions

From COLLADA Public Wiki
Alorino (talk | contribs)
 
(13 intermediate revisions by 4 users not shown)
The [[COLLADA DOM]], as with any product, has some areas that could be expanded or improved. This article lists these ideas and recommendations.
==Discussing and implementing changes==
*Add additional suggestions to this article by clicking the '''edit''' tab.
*Discuss proposed changes by clicking the '''discussion''' tab to get to this article's associated [[talk page]], then click '''edit'''.
*Anyone who wants to contribute code that includes any of these enhancements to the open-source DOM may do so by emailing Steven Thomas at [email protected].
==Build improvements==
Our build setup could use a few improvements. I've written up bugs in SourceForge to track the problems.
*[http://sourceforge.net/tracker/index.php?func=detail&aid=1755749&group_id=157838&atid=805424 Multiple trunks should be merged]
*[http://sourceforge.net/tracker/index.php?func=detail&aid=1755751&group_id=157838&atid=805424 Library dependencies should be explicit in the build tools]
*[http://sourceforge.net/tracker/index.php?func=detail&aid=1755752&group_id=157838&atid=805424 Windows build shouldn't need environment variables]
*[http://sourceforge.net/tracker/index.php?func=detail&aid=1755754&group_id=157838&atid=805424 Third party libraries needed by the DOM should be included]
*[http://sourceforge.net/tracker/index.php?func=detail&aid=1755756&group_id=157838&atid=805424 DOM: VS2003 and VS2005 don't build side by side]
==Performance optimizations==
==Performance optimizations==
''Contributed by [[User:alorino|Andy Lorino]], March 2007.''


One major performance optimization can be had by replacing '''printf''' and '''scanf''' in '''daeAtomicType''' with custom XML aware text parsing functions. This is needed for two reasons:
One major performance optimization can be had by replacing '''printf''' and '''scanf''' in '''daeAtomicType''' with customized, XML-aware text-parsing functions. This is needed for two reasons:


*The speed increase
*The speed increase.
*The standard C string formatting differs from XML string formatting. An example illustrating this is floating point infinity and NaN. XML Schema defines these as INF –INF and NaN. Standard c '''printf/scanf''' use #inf, -#inf, #nan.
*The standard C string formatting differs from XML string formatting. An example illustrating this is floating point infinity and NaN. XML Schema defines these as INF –INF and NaN. Standard C '''printf/scanf''' use #inf, -#inf, #nan.


It may be possible to add accelerator functions to the metaCMPolicy objects. Next to '''scanf''' I believe placeElement is the next performance bottleneck.
It may be possible to add accelerator functions to the '''metaCMPolicy''' objects. After '''scanf''', I believe that '''placeElement''' is the next performance bottleneck.


==Class hierarchy reorganization==
==Class hierarchy reorganization==
''Contributed by Andy Lorino, March 2007.''


Many classes inherit from '''daeElement''' just because of the smartRef reference counting. This is incredibly ugly! It bloats and complicates those subclasses (other reference counted objects). It would be nice to change that, but maybe a lot of work requiring a minor release, such as DOM 1.3.0.
Many classes inherit from '''[[daeElement]]''' just because of the '''[[smartRef]]''' reference counting. This is bad; it bloats and complicates those subclasses (other reference counted objects). It would be nice to change that, but might be a lot of work requiring a minor release, such as DOM 1.3.0.


==HexBinary type==
==HexBinary type==
''Contributed by Andy Lorino, March 2007.''


The xs:Hexbinary type is not implemented correctly.  
The '''xs:Hexbinary type''' is not implemented correctly.  


Currently nobody uses it so it’s not a big problem. But if someone were to provide <image><data> it would not be read or written correctly.
Currently nobody uses it so it’s not a big problem. But if someone were to provide '''<image><data>''' it would not be read or written correctly.


Currently HexBinary is defined as a daeCharArray. But it needs to be a two-dimensional array:
Currently, '''HexBinary''' is defined as a '''daeCharArray'''. But it needs to be a two-dimensional array:
   daeTArray< daeTArray< daeUChar > * >
   daeTArray< daeTArray< daeUChar > * >
This is because hexbinary is a string of characters encoded in hex. 1A2B3C is 3 bytes (characters). The COLLADA Schema uses a list of HexBinary. So “1A2B3C 4D5E6F” requires two three-character arrays.
This is because HexBinary is a string of characters encoded in hex. 1A2B3C is 3 bytes (characters). The [[COLLADA Schema]] uses a list of HexBinary. So “1A2B3C 4D5E6F” requires two three-character arrays.


The major setback for this to work in the DOM is that there needs to be a new metaAttributeArrayArray type. And the logic would be different than the current metaAttribute and MetaAttributeArray classes.
The major setback for this to work in the DOM is that there needs to be a new '''metaAttributeArrayArray''' type. And the logic would be different than the current '''metaAttribute''' and '''MetaAttributeArray''' classes.


I don’t know what would need to be done for this to work.
I don’t know what would need to be done for this to work.
''Contributed by Mick Pearson, March 2016.''
Ah! I finally understood the logic here. Follow me. The 1.5 specification, or the PDF anyway, lists list_of_hexBinary_type as a type. But this is not the type of the <hex> element. Its type is hexBinary. Not a list. This list type is not used in the specification because it's provided simply for encoding arbitrary user data.
Now in domImage_source.h, it defines: '''domList_of_hex_binary& getValue() { return _value; }''' --however this is simply incorrect. It should be xsHexBinary, which does not have a typedef, however it is defined in daeAtomicType.cpp separately from xsHexBinaryArray. So this is the source of the confusion. It's a mistake in daeDomTypes.h.
Furthermore these types most definitely should not hold character data. The array must be defined in terms of bytes and must be contiguous. daeArray does not strictly meet these requirements on systems that are not byte addressable. If only the <hex> element had a Required attribute called "count", then Collada DOM could round down to this number. In this case something like: '''class daeBinArray : public daeTArray<char>{ size_t _octets; ... };''' would do. Small edit--it would be unlike Collada DOM to consult a "count" attribute. It's not that smart, although it might be nice were it so.


==Raw format==
==Raw format==
''Contributed by Andy Lorino, March 2007.''


I have written a proposal for standardization of external floating point and integer data.
I have written a proposal for standardization of external floating point and integer data. {{editor | what= Where is this proposal located?}}


For doing so I added the '''daeRawResolver''', which allows DOM users to have that extra functionality without the need for clients to do any extra work.
For doing so, I added the '''daeRawResolver''', which allows DOM users to have that extra functionality without the need for clients to do any extra work.


The problem is that the URI to specify the [[.raw file]] where the data is requires a query string to store some information. daeURI does not have support for the query string. It needs to be added for the .raw and RawResolver to work correctly.
The problem is that the URI to specify the [[.raw file]] where the data is requires a query string to store some information. '''daeURI''' does not have support for the query string. It needs to be added for the .raw and RawResolver to work correctly.


Currently the libxml raw saver and the rawResolver only support 32-bit numbers but the query string “?precision= “ needs to be supported to allow for arbitrary precisions.
Currently the libxml raw saver and the rawResolver support only 32-bit numbers, but the query string “?precision= “ needs to be supported to allow for arbitrary precisions.


==I/O plug-in and resolver reorganization==
==I/O plug-in and resolver reorganization==
''Contributed by Andy Lorino, March 2007.''


Working on the Verse asset management database I/O plugin and the COLLADA RT I realized that the current structure for I/O plug-ins is insufficient.
Working on the Verse asset management database [[I/O plug-in]] and the [[COLLADA Runtime]], I realized that the current structure for I/O plug-ins is insufficient.


The COLLADA DOM should allow for multiple I/O plug-ins to be “registered” with the DOM to allow loading from different sources. (similar to the way OSG IO plug-ins work.)
The COLLADA DOM should allow for multiple I/O plug-ins to be “registered” with the DOM to allow loading from different sources, similar to the way OSG I/O plug-ins work.


When doing that, the relationship between resolvers and plug-ins will need to be reversed.  
When doing that, the relationship between [[resolver]]s and plug-ins will need to be reversed.  


Currently there is a list of resolvers. Each resolver can resolve only specific URI schemes and file extensions. If the resolver qualifies to resolve a URI it may (the default one does) call the I/O plug-in to load a document if the document is not already loaded into the database.
Currently there is a list of resolvers. Each resolver can resolve only specific URI schemes and file extensions. If the resolver qualifies to resolve a URI, it may (the default one does) call the I/O plug-in to load a document if the document is not already loaded into the database.


The better way would be for a single resolver class that queries the database for a specific element. The database then has a list of I/O plug-ins which can only load from specific URI schemes and file extensions. If the database doesn’t have the document the resolver is searching for it can load the document. The loading would be handled by the most appropriate plugin, i.e. http and file schemes handled by '''libxml''' plug-in, verse scheme handled by verse plug-in etc.
The better way would be for a single resolver class that queries the database for a specific element. The database then has a list of I/O plug-ins which can load only from specific URI schemes and file extensions. If the database doesn’t have the document for which the resolver is searching, it can load the document. The loading would be handled by the most appropriate plug-in, for example, http and file schemes handled by '''libxml''' plug-in, or Verse schemes handled by Verse plug-in.


==SID resolvers==
==SID resolvers==
''Contributed by Andy Lorino, March 2007.''


The SID resolver as it stands works.
The [[SID resolver]], as it stands, works. (See [[DOM guide: Resolving SIDs]].)


The COLLADA schema needs to be pushed to give some more semantic meaning to the types it uses. Often there are '''xs:NCName''' with semantic meaning but no way to know based on the name, just the context.
The [[COLLADA schema]] needs to be pushed to give more semantic meaning to the types that it uses. Often there are '''xs:NCName''' elements with semantic meaning but no way to know based on the name, just the context.


The data type should be named '''SIDType''' and '''SIDRefType''' to give these '''NCName'''s a semantic meaning.
The data type should be named '''SID''' and '''SIDRef''' (or something similar) to give these '''NCName'''s a semantic meaning.


When that happens, the '''SIDResolver''' can be made to resolve SIDRef types automatically similar to the way URI and IDRef are resolved automatically upon load.
When that happens, the '''SIDResolver''' can be made to resolve '''SIDRef''' types automatically, similar to the way URI and IDRef are resolved automatically upon load.


==String table and memory system==
==String table and memory system==
''Contributed by Andy Lorino, March 2007.''
Implement them to actually do what they should. {{editor | what= And what is it that they should do? Are these discussed in the related articles?}}


Implement them to actually do what they should.
They would both drastically improve memory usage in the DOM. The stringTable should most likely help a lot more than the memorySystem.


They would both drastically improve DOM memory usage, the stringTable should most likely help a lot more than the memorySystem.
{{DOM navigation}}


[[Category:DOM project|Future work]]
[[Category:COLLADA DOM|Future work]]

Latest revision as of 09:13, 8 March 2016

The COLLADA DOM, as with any product, has some areas that could be expanded or improved. This article lists these ideas and recommendations.

Discussing and implementing changes

  • Add additional suggestions to this article by clicking the edit tab.
  • Discuss proposed changes by clicking the discussion tab to get to this article's associated talk page, then click edit.
  • Anyone who wants to contribute code that includes any of these enhancements to the open-source DOM may do so by emailing Steven Thomas at [email protected].

Build improvements

Our build setup could use a few improvements. I've written up bugs in SourceForge to track the problems.

Performance optimizations

Contributed by Andy Lorino, March 2007.

One major performance optimization can be had by replacing printf and scanf in daeAtomicType with customized, XML-aware text-parsing functions. This is needed for two reasons:

  • The speed increase.
  • The standard C string formatting differs from XML string formatting. Floating-point infinity and NaN illustrate this: XML Schema defines them as INF, -INF, and NaN, whereas C printf/scanf implementations use other spellings, such as #inf, -#inf, and #nan.
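The difference in spellings can be handled by treating the XML Schema special values before falling back to the C library. The following is only a minimal sketch of that idea, not the DOM's implementation: `parseXsDouble` and `formatXsDouble` are hypothetical names, and the sketch still relies on `strtod` and a stream for ordinary numbers, so it addresses the formatting mismatch rather than the speed concern.

```cpp
#include <cassert>
#include <cmath>
#include <cstdlib>
#include <limits>
#include <locale>
#include <sstream>
#include <string>

// Hypothetical helper: parse one xs:double lexical value.
// Returns false if the text is not a complete, valid number.
bool parseXsDouble(const std::string& text, double& out) {
    // XML Schema spellings of the special values.
    if (text == "INF")  { out = std::numeric_limits<double>::infinity();  return true; }
    if (text == "-INF") { out = -std::numeric_limits<double>::infinity(); return true; }
    if (text == "NaN")  { out = std::numeric_limits<double>::quiet_NaN(); return true; }
    if (text.empty()) return false;
    // Reject anything strtod would accept but XML Schema does not
    // (for example "inf", "nan", hex floats, or leading whitespace).
    if (text.find_first_not_of("0123456789+-.eE") != std::string::npos) return false;
    char* end = nullptr;
    // Note: strtod honors the C locale's decimal point; this sketch assumes '.'.
    const double value = std::strtod(text.c_str(), &end);
    if (end != text.c_str() + text.size()) return false;
    out = value;
    return true;
}

// Hypothetical helper: write a double using XML Schema spellings.
std::string formatXsDouble(double value) {
    if (std::isnan(value)) return "NaN";
    if (std::isinf(value)) return value > 0 ? "INF" : "-INF";
    assert(std::isfinite(value));
    std::ostringstream os;
    os.imbue(std::locale::classic());  // always use '.' as the decimal point
    os.precision(std::numeric_limits<double>::max_digits10);
    os << value;
    return os.str();
}
```

A faster version would replace the `strtod` and stream calls with a hand-written digit loop; the special-value handling would stay the same.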

It may be possible to add accelerator functions to the metaCMPolicy objects. After scanf, I believe that placeElement is the next performance bottleneck.

Class hierarchy reorganization

Contributed by Andy Lorino, March 2007.

Many classes inherit from daeElement just because of the smartRef reference counting. This is bad; it bloats and complicates those subclasses (other reference counted objects). It would be nice to change that, but might be a lot of work requiring a minor release, such as DOM 1.3.0.
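One way to decouple reference counting from the element class is a small counting base that the smart reference works against. The sketch below is only an illustration under that assumption: `RefCounted`, `SmartRef`, and `Resource` are hypothetical names, not the DOM's actual classes, and the counter is not thread-safe.

```cpp
#include <cassert>

// Hypothetical base class: provides reference counting and nothing else.
class RefCounted {
public:
    void ref() const { ++count_; }
    void release() const {
        assert(count_ > 0);
        if (--count_ == 0) delete this;
    }
    int refCount() const { return count_; }

protected:
    RefCounted() = default;
    // Copies start with their own count; the count is never copied.
    RefCounted(const RefCounted&) : count_(0) {}
    RefCounted& operator=(const RefCounted&) { return *this; }
    virtual ~RefCounted() = default;

private:
    mutable int count_ = 0;
};

// Hypothetical smart reference that works with any RefCounted subclass.
template <typename T>
class SmartRef {
public:
    SmartRef(T* p = nullptr) : ptr_(p) { if (ptr_) ptr_->ref(); }
    SmartRef(const SmartRef& other) : ptr_(other.ptr_) { if (ptr_) ptr_->ref(); }
    SmartRef& operator=(const SmartRef& other) {
        if (other.ptr_) other.ptr_->ref();  // ref first, so self-assignment is safe
        if (ptr_) ptr_->release();
        ptr_ = other.ptr_;
        return *this;
    }
    ~SmartRef() { if (ptr_) ptr_->release(); }
    T* operator->() const { return ptr_; }
    T* get() const { return ptr_; }

private:
    T* ptr_;
};

// Example of a non-element class that is reference counted
// without inheriting anything element-specific.
class Resource : public RefCounted {
public:
    int id = 0;
};
```

With this split, an element class would also derive from the counting base, while helper objects would carry only the counter.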

HexBinary type

Contributed by Andy Lorino, March 2007.

The xs:hexBinary type is not implemented correctly.

Currently nobody uses it, so it’s not a big problem. But if someone were to provide <image><data>, it would not be read or written correctly.

Currently, HexBinary is defined as a daeCharArray. But it needs to be a two-dimensional array:

 daeTArray< daeTArray< daeUChar > * >

This is because HexBinary is a string of characters encoded in hex. 1A2B3C is 3 bytes (characters). The COLLADA Schema uses a list of HexBinary. So “1A2B3C 4D5E6F” requires two three-character arrays.
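The “1A2B3C 4D5E6F” example can be sketched in code as follows. This is an illustration only: it uses standard containers instead of daeTArray, and `decodeHexBinary` and `decodeHexBinaryList` are hypothetical names. The single-value helper is also what would be needed if the value turns out to be one hexBinary rather than a list.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

using ByteArray = std::vector<std::uint8_t>;

// Returns 0-15 for a hex digit, or -1 for any other character.
int hexDigitValue(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Decodes one hexBinary value: two hex digits per byte.
bool decodeHexBinary(const std::string& token, ByteArray& out) {
    if (token.size() % 2 != 0) return false;
    ByteArray bytes;
    bytes.reserve(token.size() / 2);
    for (std::size_t i = 0; i < token.size(); i += 2) {
        const int hi = hexDigitValue(token[i]);
        const int lo = hexDigitValue(token[i + 1]);
        if (hi < 0 || lo < 0) return false;
        bytes.push_back(static_cast<std::uint8_t>(hi * 16 + lo));
    }
    assert(bytes.size() == token.size() / 2);
    out = std::move(bytes);
    return true;
}

// Decodes a whitespace-separated list of hexBinary values
// into one byte array per list item.
bool decodeHexBinaryList(const std::string& text, std::vector<ByteArray>& out) {
    std::vector<ByteArray> result;
    std::istringstream tokens(text);
    std::string token;
    while (tokens >> token) {
        ByteArray bytes;
        if (!decodeHexBinary(token, bytes)) return false;
        result.push_back(std::move(bytes));
    }
    out = std::move(result);
    return true;
}
```

Each decoded item is stored as a contiguous run of bytes, which matches the two three-byte arrays described above.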

The major obstacle to making this work in the DOM is that it needs a new metaAttributeArrayArray type, whose logic would differ from that of the current metaAttribute and MetaAttributeArray classes.

I don’t know what would need to be done for this to work.

Contributed by Mick Pearson, March 2016.

Ah! I finally understood the logic here. Follow me. The 1.5 specification, or the PDF anyway, lists list_of_hexBinary_type as a type. But this is not the type of the <hex> element. Its type is hexBinary. Not a list. This list type is not used in the specification because it's provided simply for encoding arbitrary user data.

Now, domImage_source.h defines domList_of_hex_binary& getValue() { return _value; }, but this is simply incorrect. It should be xsHexBinary, which does not have a typedef; however, it is defined in daeAtomicType.cpp separately from xsHexBinaryArray. So this is the source of the confusion. It's a mistake in daeDomTypes.h.

Furthermore these types most definitely should not hold character data. The array must be defined in terms of bytes and must be contiguous. daeArray does not strictly meet these requirements on systems that are not byte addressable. If only the <hex> element had a Required attribute called "count", then Collada DOM could round down to this number. In this case something like: class daeBinArray : public daeTArray<char>{ size_t _octets; ... }; would do. Small edit--it would be unlike Collada DOM to consult a "count" attribute. It's not that smart, although it might be nice were it so.

Raw format

Contributed by Andy Lorino, March 2007.

I have written a proposal for standardization of external floating point and integer data. ((EDITOR: This page needs the following improvement: Where is this proposal located? ))


For doing so, I added the daeRawResolver, which allows DOM users to have that extra functionality without the need for clients to do any extra work.

The problem is that the URI that identifies the .raw file holding the data requires a query string to store some information. daeURI does not support query strings. That support needs to be added for the .raw format and RawResolver to work correctly.

Currently the libxml raw saver and the rawResolver support only 32-bit numbers, but the query string “?precision=” needs to be supported to allow for arbitrary precisions.
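Extracting a query string from a URI could look roughly like the sketch below. It does not use daeURI; `uriQuery` and `queryValue` are hypothetical helpers, and the example URIs and the value given to `precision` are invented for illustration, because the format of that value is not specified here.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: returns the query component of a URI reference,
// that is, the text between '?' and '#', without the '?'.
std::string uriQuery(const std::string& uri) {
    const std::string withoutFragment = uri.substr(0, uri.find('#'));
    const std::size_t question = withoutFragment.find('?');
    if (question == std::string::npos) return "";
    return withoutFragment.substr(question + 1);
}

// Hypothetical helper: looks up `key` in an "a=1&b=2" style query.
// Returns `fallback` if the key is absent.
std::string queryValue(const std::string& query, const std::string& key,
                       const std::string& fallback) {
    assert(!key.empty());
    std::size_t start = 0;
    while (start <= query.size()) {
        std::size_t end = query.find('&', start);
        if (end == std::string::npos) end = query.size();
        const std::string pair = query.substr(start, end - start);
        const std::size_t eq = pair.find('=');
        if (eq != std::string::npos && pair.substr(0, eq) == key) {
            return pair.substr(eq + 1);
        }
        start = end + 1;
    }
    return fallback;
}
```

For example, `queryValue(uriQuery("mesh.raw?precision=64"), "precision", "32")` yields `"64"`, and a URI without a query falls back to `"32"`, mirroring the current 32-bit-only behavior.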

I/O plug-in and resolver reorganization

Contributed by Andy Lorino, March 2007.

Working on the Verse asset management database I/O plug-in and the COLLADA Runtime, I realized that the current structure for I/O plug-ins is insufficient.

The COLLADA DOM should allow for multiple I/O plug-ins to be “registered” with the DOM to allow loading from different sources, similar to the way OSG I/O plug-ins work.

When doing that, the relationship between resolvers and plug-ins will need to be reversed.

Currently there is a list of resolvers. Each resolver can resolve only specific URI schemes and file extensions. If the resolver qualifies to resolve a URI, it may (the default one does) call the I/O plug-in to load a document if the document is not already loaded into the database.

A better approach would be a single resolver class that queries the database for a specific element. The database would then hold a list of I/O plug-ins, each of which can load only from specific URI schemes and file extensions. If the database doesn’t have the document for which the resolver is searching, it can load the document. The loading would be handled by the most appropriate plug-in: for example, the http and file schemes would be handled by the libxml plug-in, and Verse schemes by the Verse plug-in.
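The reversed relationship described above might be sketched as follows. This is a simplified illustration, not a proposed API: `IOPlugin`, `Database`, and `StubPlugin` are hypothetical names, documents are represented as plain strings, and plug-ins are selected by URI scheme only (file extensions are left out).

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical plug-in interface: each plug-in declares which schemes it loads.
class IOPlugin {
public:
    virtual ~IOPlugin() = default;
    virtual bool handlesScheme(const std::string& scheme) const = 0;
    virtual std::string load(const std::string& uri) = 0;
};

// Hypothetical database that owns the plug-in list and the loaded documents.
class Database {
public:
    void registerPlugin(std::unique_ptr<IOPlugin> plugin) {
        assert(plugin != nullptr);
        plugins_.push_back(std::move(plugin));
    }

    // Returns the document for `uri`, loading it through the first plug-in
    // that accepts the URI's scheme. Returns nullptr if no plug-in qualifies.
    const std::string* findOrLoad(const std::string& uri) {
        const auto found = documents_.find(uri);
        if (found != documents_.end()) return &found->second;
        const std::string scheme = uri.substr(0, uri.find(':'));
        for (auto& plugin : plugins_) {
            if (plugin->handlesScheme(scheme)) {
                const auto inserted = documents_.emplace(uri, plugin->load(uri));
                return &inserted.first->second;
            }
        }
        return nullptr;
    }

private:
    std::vector<std::unique_ptr<IOPlugin>> plugins_;
    std::map<std::string, std::string> documents_;
};

// Stand-in plug-in for demonstration; it counts how often it is asked to load.
class StubPlugin : public IOPlugin {
public:
    StubPlugin(std::string scheme, int* loadCount)
        : scheme_(std::move(scheme)), loadCount_(loadCount) {}
    bool handlesScheme(const std::string& scheme) const override {
        return scheme == scheme_;
    }
    std::string load(const std::string& uri) override {
        ++*loadCount_;
        return "<document loaded from " + uri + ">";
    }

private:
    std::string scheme_;
    int* loadCount_;
};
```

A single resolver would then call `findOrLoad` and never talk to a plug-in directly; a document that is already in the database is returned without loading it again.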

SID resolvers

Contributed by Andy Lorino, March 2007.

The SID resolver, as it stands, works. (See DOM guide: Resolving SIDs.)

The COLLADA schema needs to be pushed to give more semantic meaning to the types that it uses. Often there are xs:NCName elements with semantic meaning but no way to know based on the name, just the context.

The data type should be named SID and SIDRef (or something similar) to give these NCNames a semantic meaning.

When that happens, the SIDResolver can be made to resolve SIDRef types automatically, similar to the way URI and IDRef are resolved automatically upon load.

String table and memory system

Contributed by Andy Lorino, March 2007.

Implement them to actually do what they should. ((EDITOR: This page needs the following improvement: And what is it that they should do? Are these discussed in the related articles? ))


They would both drastically improve memory usage in the DOM. The stringTable should most likely help a lot more than the memorySystem.
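The intended behavior of the string table is not spelled out here. One common technique that would reduce memory usage is string interning, where each distinct string is stored once and shared. The sketch below assumes that interpretation; `StringTable` and `intern` are hypothetical names and do not describe the DOM's existing classes.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_set>

// Hypothetical interning table: stores each distinct string exactly once.
class StringTable {
public:
    // Returns a pointer to the table's single copy of `text`.
    // The pointer stays valid for as long as the table is alive, because
    // std::unordered_set never moves its elements, even when it rehashes.
    const char* intern(const std::string& text) {
        const auto result = strings_.insert(text);
        assert(*result.first == text);
        return result.first->c_str();
    }

    std::size_t size() const { return strings_.size(); }

private:
    std::unordered_set<std::string> strings_;
};
```

Under this scheme, names that repeat many times in a document would cost one allocation in total, and equal strings can be compared by pointer.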


COLLADA DOM - Version 2.4 Historical Reference