DOM future work
Performance optimizations
One major performance optimization can be had by replacing printf and scanf in daeAtomicType with custom XML-aware text parsing functions. This is needed for two reasons:
- The speed increase.
- The standard C string formatting differs from XML string formatting. An example is floating-point infinity and NaN: XML Schema defines these as INF, -INF, and NaN, while standard C printf/scanf use #inf, -#inf, and #nan (a sketch of an XML-aware formatter follows this list).
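As a rough illustration (a minimal sketch, not the DOM's actual code; the function names are invented), a custom XML-aware formatter/parser would handle the special values with the XML Schema spellings and fall back to ordinary conversion for finite numbers:

 #include <cmath>
 #include <cstdio>
 #include <cstdlib>
 #include <cstring>
 #include <string>

 // Sketch only: write a double using the XML Schema spellings for the
 // special values, falling back to ordinary formatting for finite numbers.
 std::string xmlFormatDouble(double v) {
     if (std::isnan(v)) return "NaN";
     if (std::isinf(v)) return v > 0 ? "INF" : "-INF";
     char buf[64];
     std::snprintf(buf, sizeof(buf), "%g", v);
     return buf;
 }

 // Sketch only: parse a double, accepting INF, -INF and NaN as XML Schema
 // defines them, rather than the #inf/#nan forms some C runtimes produce.
 double xmlParseDouble(const char* s) {
     if (std::strcmp(s, "INF") == 0)  return  HUGE_VAL;
     if (std::strcmp(s, "-INF") == 0) return -HUGE_VAL;
     if (std::strcmp(s, "NaN") == 0)  return std::nan("");
     return std::strtod(s, NULL);
 }

A replacement aimed mainly at speed would also hand-roll the digit conversion for finite values instead of calling snprintf/strtod.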
It may be possible to add accelerator functions to the metaCMPolicy objects. After scanf, I believe placeElement is the next performance bottleneck.
Class hierarchy reorganization
Many classes inherit from daeElement just because of the smartRef reference counting. This is incredibly ugly! It bloats and complicates those subclasses (other reference-counted objects). It would be nice to change that, but it may be a lot of work requiring a minor release, such as DOM 1.3.0.
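One possible direction, shown only as a hypothetical sketch (the class name and members are invented, not an agreed design): pull the reference counting out into a small base class so that smartRef can manage objects without forcing them to derive from daeElement.

 // Sketch only: reference counting lives in a small base class, so smartRef
 // can manage objects that are not elements.
 class daeRefCountedObj {
 public:
     daeRefCountedObj() : _refCount(0) {}
     virtual ~daeRefCountedObj() {}
     void ref()     { ++_refCount; }
     void release() { if (--_refCount == 0) delete this; }
 private:
     int _refCount;
 };

 // daeElement and the other smartRef-managed classes would then derive from
 // daeRefCountedObj and share only the counting, not the element machinery.
 class daeElement : public daeRefCountedObj { /* element-specific API */ };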
HexBinary type
The xs:hexBinary type is not implemented correctly.
Currently nobody uses it so it’s not a big problem. But if someone were to provide <image><data>, it would not be read or written correctly.
Currently HexBinary is defined as a daeCharArray. But it needs to be a two-dimensional array:
daeTArray< daeTArray< daeUChar > * >
This is because hexBinary is a string of characters encoding binary data in hex: “1A2B3C” is six characters encoding 3 bytes. The COLLADA Schema uses a list of hexBinary values, so “1A2B3C 4D5E6F” requires two arrays of three bytes each.
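For illustration only (a sketch using std containers and invented helper names; a DOM version would fill a daeTArray< daeTArray< daeUChar > * > instead), decoding such a list looks roughly like this:

 #include <sstream>
 #include <string>
 #include <vector>

 // Sketch only: decode one hex digit. Invalid characters are ignored here;
 // real code would report an error.
 static unsigned char hexNibble(char c) {
     if (c >= '0' && c <= '9') return c - '0';
     if (c >= 'A' && c <= 'F') return c - 'A' + 10;
     if (c >= 'a' && c <= 'f') return c - 'a' + 10;
     return 0;
 }

 // "1A2B3C 4D5E6F" -> two arrays of three bytes each.
 std::vector< std::vector<unsigned char> > parseHexBinaryList(const std::string& text) {
     std::vector< std::vector<unsigned char> > result;
     std::istringstream in(text);
     std::string token;
     while (in >> token) {                       // one token per hexBinary value
         std::vector<unsigned char> bytes;
         for (size_t i = 0; i + 1 < token.size(); i += 2)
             bytes.push_back((hexNibble(token[i]) << 4) | hexNibble(token[i + 1]));
         result.push_back(bytes);
     }
     return result;
 }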
The major setback is that the DOM would need a new metaAttributeArrayArray type, and its logic would be different from the current metaAttribute and metaAttributeArray classes.
I don’t know what would need to be done for this to work.
Raw format
I have written a proposal for standardization of external floating point and integer data.
As part of that I added the daeRawResolver, which gives DOM users that extra functionality without requiring clients to do any extra work.
The problem is that the URI specifying the .raw file where the data lives requires a query string to store some information, and daeURI does not support query strings. Query-string support needs to be added for the .raw format and the daeRawResolver to work correctly.
Currently the libxml raw saver and the daeRawResolver only support 32-bit numbers, but the “?precision=” query string needs to be supported to allow for arbitrary precisions.
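As a rough sketch of the missing piece (the helper function and the example URI are hypothetical, not existing daeURI API), the resolver would need to be able to pull a parameter such as precision out of a query string:

 #include <string>

 // Sketch only: extract one key's value from a URI query string, e.g.
 // "mesh.raw?precision=64" -> "64".
 std::string getQueryParam(const std::string& uri, const std::string& key) {
     std::string::size_type q = uri.find('?');
     if (q == std::string::npos) return "";
     std::string query = uri.substr(q + 1);
     std::string::size_type pos = 0;
     while (pos <= query.size()) {
         std::string::size_type amp = query.find('&', pos);
         std::string::size_type end = (amp == std::string::npos) ? query.size() : amp;
         std::string pair = query.substr(pos, end - pos);
         std::string::size_type eq = pair.find('=');
         if (eq != std::string::npos && pair.compare(0, eq, key) == 0)
             return pair.substr(eq + 1);
         if (amp == std::string::npos) break;
         pos = amp + 1;
     }
     return "";
 }

For example, getQueryParam("mesh.raw?precision=64", "precision") would return "64".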
I/O plug-in and resolver reorganization
Working on the Verse asset management database I/O plug-in and the COLLADA RT, I realized that the current structure for I/O plug-ins is insufficient.
The COLLADA DOM should allow multiple I/O plug-ins to be “registered” with the DOM to allow loading from different sources (similar to the way OSG I/O plug-ins work).
When that is done, the way resolvers and plug-ins currently interact will need to be reversed.
Currently there is a list of resolvers. Each resolver can resolve only specific URI schemes and file extensions. If a resolver qualifies to resolve a URI, it may (the default one does) call the I/O plug-in to load the document if it is not already loaded into the database.
The better way would be a single resolver class that queries the database for a specific element. The database would then have a list of I/O plug-ins, each of which can load only from specific URI schemes and file extensions. If the database doesn’t have the document the resolver is searching for, it can load it, with the loading handled by the most appropriate plug-in, e.g. http and file schemes handled by the libxml plug-in, the verse scheme handled by the verse plug-in, and so on.
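A hypothetical sketch of that arrangement (all class and method names here are invented for illustration, not the DOM's current API):

 #include <string>
 #include <vector>

 // Sketch only: each plug-in advertises the URI schemes and file extensions
 // it handles; the database picks the best match when a requested document
 // is not loaded yet.
 class daeIOPluginBase {
 public:
     virtual ~daeIOPluginBase() {}
     virtual bool handlesScheme(const std::string& scheme) const = 0;
     virtual bool handlesExtension(const std::string& ext) const = 0;
     virtual bool loadDocument(const std::string& uri) = 0;
 };

 class daeDatabaseSketch {
 public:
     void registerPlugin(daeIOPluginBase* plugin) { _plugins.push_back(plugin); }

     // Called by the single resolver when the requested document is missing.
     bool loadMissingDocument(const std::string& uri,
                              const std::string& scheme,
                              const std::string& ext) {
         for (size_t i = 0; i < _plugins.size(); ++i)
             if (_plugins[i]->handlesScheme(scheme) && _plugins[i]->handlesExtension(ext))
                 return _plugins[i]->loadDocument(uri);
         return false;   // no registered plug-in can handle this source
     }

 private:
     std::vector<daeIOPluginBase*> _plugins;
 };

The single resolver would then only ask the database for the element; it would never talk to a plug-in directly.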
SID resolvers
The SID resolver as it stands works.
The COLLADA schema needs to be pushed to give more semantic meaning to the types it uses. Often there are xs:NCName values with semantic meaning, but there is no way to know that from the type, only from the context.
The data types should be named SIDType and SIDRefType to give these NCNames a semantic meaning.
When that happens, the SIDResolver can be made to resolve SIDRef types automatically similar to the way URI and IDRef are resolved automatically upon load.
String table and memory system
The string table and memory system need to be implemented so that they actually do what they should.
They would both drastically improve DOM memory usage; the stringTable should most likely help a lot more than the memorySystem.
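As a sketch of what a working string table buys (string interning; the class and method names here are invented, not the DOM's stringTable interface), each distinct string is stored once and callers receive pointers to the shared copy:

 #include <set>
 #include <string>

 // Sketch only: intern() returns a pointer to the single shared copy of a
 // string, inserting it on first use.
 class StringTableSketch {
 public:
     const char* intern(const std::string& s) {
         return _strings.insert(s).first->c_str();
     }
 private:
     std::set<std::string> _strings;   // node-based, so the pointers stay valid
 };

Element and attribute names repeat constantly across a document, so sharing one copy of each distinct string can cut memory use substantially.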