General Questions (Forum Closed)


DNA is a different, very efficient way to store data, like the
SDCS principle from Jan Sloot. Like SDCS, it uses key codes (DNA
codes) of <= 256 bytes together with a Reference Table.

For an online Video Demonstration of DNA, click
In this Forum you can ask questions and follow the developments.
Forum rules
───────────
Use the common rules of the internet (netiquette). Discussions about the existence of the DNA algorithm or about the faking of the Video Demo add no value to the further development of DNA. Submitted messages that do not comply with these rules are removed by the moderators without giving a reason. Details about this project are released only by the author, not by the Forum Manager. Starting a new topic is allowed when the subject does not exist yet. This Forum shares data with anti-spam services worldwide through an API.

Re: General Questions (Guest threads Allowed)

Post by Webmaster » Sun 22 Jan 2012, 13:59

uwequbit wrote: @Webmaster
Who has "translated" my text? Please repeat! ;)

Hello Uwe,

When needed, all messages in this Topic will be translated manually
(with the original message included). Some people are able to read English but
have difficulty writing it. This makes it possible for everyone to follow this
Topic.

Yes, it's repeatable ;)

Best regards, Webmaster
Webmaster
Administrator
 
Posts: 1848
Joined: Sat 14 Aug 2010, 13:21

Re: General Questions (Guest threads Allowed)

Post by uwequbit » Sun 22 Jan 2012, 16:04

Hello Webmaster,

Please post the text again, newly translated into English. I have posted the second set again.

Thank you (and Google Translator) ;)

regards

uwequbit
uwequbit
 
Posts: 3
Joined: Sun 30 Oct 2011, 11:43

Re: General Questions (Guest threads Allowed)

Post by Siegmund » Sun 22 Jan 2012, 19:12

@SDSC
You're welcome.
@Webmaster
Thanks for translating.

Siegmund
Siegmund
 

Re: General Questions (Guest threads Allowed)

Post by SDSC » Wed 25 Jan 2012, 22:51

@ uwequbit,

uwequbit wrote: Thank you very much for publishing the documentation. Respect!!!

Thanks.

uwequbit wrote: The more files are compressed, the bigger the reference table will be. Regarding the reference table: is it a mathematical model, or does it grow iteratively, i.e. is it created during compression? (That would be logical, but it will never reach a 4^255 pattern (4 Byte pattern).)

The reference table for version 2 of DNA grows iteratively, but sequences (patterns) are added only once. Store data only once!
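The "store data only once" idea can be illustrated with a small sketch. This is ordinary block deduplication, not the DNA algorithm itself, whose internals are not described in this thread. The `ReferenceTable` class, the `encode`/`decode` helpers, the fixed 4-byte block size and the use of SHA-256 as a lookup key are all assumptions made for illustration.

```python
import hashlib


class ReferenceTable:
    """Toy table that stores each distinct block exactly once."""

    def __init__(self):
        self._index = {}   # digest -> position in self._blocks
        self._blocks = []  # stored blocks, in insertion order

    def add(self, block: bytes) -> int:
        """Return the reference for a block, adding it only if unseen."""
        digest = hashlib.sha256(block).digest()
        if digest not in self._index:
            self._index[digest] = len(self._blocks)
            self._blocks.append(block)
        return self._index[digest]

    def get(self, ref: int) -> bytes:
        return self._blocks[ref]

    def __len__(self) -> int:
        return len(self._blocks)


def encode(data: bytes, table: ReferenceTable, block_size: int = 4) -> list:
    """Split data into fixed-size blocks and return their references."""
    return [table.add(data[i:i + block_size])
            for i in range(0, len(data), block_size)]


def decode(refs: list, table: ReferenceTable) -> bytes:
    """Rebuild the original data from a list of references."""
    return b"".join(table.get(r) for r in refs)


table = ReferenceTable()
first = encode(b"ABCDABCDEFGH", table)   # stores ABCD and EFGH
second = encode(b"EFGHABCDIJKL", table)  # only IJKL is new
print(first, second, len(table))         # [0, 0, 1] [1, 0, 2] 3
```

In this toy model the table grows only when a block has not been seen before, so a second file that repeats earlier blocks adds little to it.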

uwequbit wrote: but it will never reach a 4^255 pattern (4 Byte pattern).

Can you explain more?

uwequbit wrote: How big is one pattern in the reference table?

The size of a DNA sequence (pattern) in the reference table varies. It depends on the block size and the data content of the block. I prefer to use the term factor here, which is defined as [block size] / [sequence size]. For version 2 of DNA this factor has a minimum of 4.26, and tests showed values up to 4.78, always starting with an empty reference table and therefore representing the worst-case situation. The main goal at this moment is to improve this value; the higher the better. The magical value for this factor is 5.68: if this value can be reached, DNA can do without a reference table.
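The factor definition can be turned into a quick calculation. This sketch only rearranges the formula above as sequence size = block size / factor. The block size of 1024 bytes is a hypothetical value chosen for illustration, since the thread does not state the real block size, and `sequence_size` is a helper name introduced here.

```python
def sequence_size(block_size: float, factor: float) -> float:
    """Sequence size implied by factor = block size / sequence size."""
    return block_size / factor


BLOCK_SIZE = 1024  # hypothetical block size in bytes, not from the thread

# The three factors mentioned in the post: minimum, best test result, target.
for factor in (4.26, 4.78, 5.68):
    size = sequence_size(BLOCK_SIZE, factor)
    print(f"factor {factor}: about {size:.1f} bytes per sequence")
```

A higher factor means a smaller sequence per block, which is why the post treats raising the factor as the main goal.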

Best regards, SDSC.
The best way to slow down your competitors is to give them your source code!
SDSC
 
Posts: 24
Joined: Fri 01 Oct 2010, 21:53

Re: General Questions (Guest threads Allowed)

Post by Karel Jan » Thu 26 Jan 2012, 13:19

Interesting Topic.

This is one of the few Topics I'll follow. Thanks for sharing (WP).

Karel Jan
Karel Jan
 

Re: General Questions (Guest threads Allowed)

Post by David Hofman » Sun 29 Jan 2012, 12:25

Hmm, after reading your whitepaper, I must admit I'm a bit disappointed. Essentially, you seem to be storing chunks from the input files in the reference table (well, not literally, but that's what it boils down to fundamentally).

This means it only works for storing ('encoding') multiple files if they share common patterns.

If you encode 1000 .zip or .rar files of 1-2MB each, using DNA, you will NOT be able to reduce them all to 256 bytes and keep your reference table at or below 740 MB.
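One rough way to probe the shared-pattern argument is to count how many chunks recur across independently compressed files. The sketch below is an assumption-laden experiment, not a test of DNA: it uses fixed 64-byte chunks, zlib as a stand-in for Zip/Rar, and 20 synthetic texts built from a five-word vocabulary. The `distinct_chunks` helper is hypothetical.

```python
import random
import zlib


def distinct_chunks(files, chunk_size=64):
    """Count total and distinct full-size chunks across all files."""
    total = 0
    seen = set()
    for data in files:
        # Only full chunks are counted; a short tail is ignored.
        for i in range(0, len(data) - chunk_size + 1, chunk_size):
            total += 1
            seen.add(data[i:i + chunk_size])
    return total, len(seen)


rng = random.Random(0)  # fixed seed so the run is repeatable
words = [b"alpha", b"beta", b"gamma", b"delta", b"omega"]

# Hypothetical inputs: 20 different texts over the same small vocabulary.
texts = [b" ".join(rng.choice(words) for _ in range(5000)) for _ in range(20)]
packed = [zlib.compress(t, 9) for t in texts]

total, distinct = distinct_chunks(packed)
print(f"compressed files: {distinct} distinct chunks out of {total}")
```

If nearly every chunk is distinct, a table that stores each chunk once has to hold nearly all of the compressed data, which is the situation the post describes for .zip and .rar inputs.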

Furthermore, I'm pretty sure that if you compress a bunch of files using Rar or 7-Zip with solid archiving, the resulting archive will always be smaller than encoding it with DNA and taking the resulting reference file size into account. This means there is essentially NO advantage in required disk size to store certain data, or the effective compression ratio achieved by this method.

Maybe I'm missing the point, but I don't see any revolutionary benefits here? :(
David Hofman
 

Re: General Questions (Guest threads Allowed)

Post by Eberhard » Mon 30 Jan 2012, 09:24

Some quotes from your White Paper
SDSC wrote: By creating unique (DNA) sequences from blocks of data which are equal for every file type,
it would be possible to replace this block by a shorter reference.

I suppose every file type includes Zip, Rar, 7-Zip, etc.

SDSC wrote: Try to compress a compressed file again and you will see that it will not compress any more,
because all the redundancy was removed in the first run. The algorithms have reached the end
of their capacity, and as a result the file grows again. (negative compression!)

Known limitations :(
SDSC wrote: DNA is not a replacement of current compression techniques but is an addition to it and they
can work very well together.
Very interesting :)


Some other quotes
SDSC wrote: For version 2 of DNA this factor has a minimum of 4.26, and tests showed values up to 4.78,
always starting with an empty reference table and therefore representing the worst-case situation.

and
SDSC wrote: DNA in this case needs more files/data to become effective (it has to learn).

Referring to diagram 1, you showed me that this factor grows during the learning process.

When I put all the quotes together, it looks like you can shrink every compressed file by a factor
of >= 4.26 with an empty reference table. If that's true, then it is very impressive!
If you can handle the speed, your test version implemented in current compression tools
will give it a boost :roll:


Regards, Eberhard (Röth)
Eberhard
 

Re: General Questions (Guest threads Allowed)

Post by Eberhard » Mon 30 Jan 2012, 09:35

It was impossible to modify my earlier reply :(
SDSC wrote: Up to the fifth iteration the key code decreases strongly, but after that the effect of the
remaining iterations gets smaller (thirteen iterations for 750 bytes).

Is there a possibility here to improve your encoding time?

Eberhard
Eberhard
 

Re: General Questions (Guest threads Allowed)

Post by David Hofman » Mon 30 Jan 2012, 10:29

Eberhard wrote: When I put all the quotes together, it looks like you can shrink every compressed file by a factor
of >= 4.26 with an empty reference table. If that's true, then it is very impressive!

It would be very impressive indeed, because this is mathematically impossible :)

You cannot store any possible combination of N bits in K bits, if K < N.
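The counting argument behind this statement can be checked directly. The sketch below simply counts bit strings; the function names are introduced here for illustration. There are 2^N inputs of N bits but fewer possible outputs of any shorter length, so at least two inputs would have to share an output and could not both be restored.

```python
def count_strings(bits: int) -> int:
    """Number of distinct bit strings of exactly this length."""
    return 2 ** bits


def count_shorter_strings(bits: int) -> int:
    """Number of distinct bit strings of length 0 up to bits - 1."""
    return sum(2 ** k for k in range(bits))


N = 8
print(count_strings(N), count_shorter_strings(N))  # 256 255

# For every K < N there are fewer K-bit outputs than N-bit inputs.
for K in range(N):
    assert count_strings(K) < count_strings(N)
```

Even when all shorter lengths are allowed at once, there is one output too few (255 versus 256 for N = 8), so no lossless scheme can shrink every possible input.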

@SDSC: could you please comment on my previous post above? I really hope I'm wrong about this :(
David Hofman
 

Re: General Questions (Guest threads Allowed)

Post by SDSC » Wed 01 Feb 2012, 22:00

@ Eberhard,

Eberhard wrote: I suppose every file type includes Zip, Rar, 7-Zip, etc.

Yes, DNA processes every file as equal, just a sequence of numbers!

Eberhard wrote: Some other quotes
SDSC wrote:
For version 2 of DNA this factor has a minimum of 4.26, and tests showed values up to 4.78,
always starting with an empty reference table and therefore representing the worst-case situation.
and
SDSC wrote:
DNA in this case needs more files/data to become effective (it has to learn).
Referring to diagram 1, you showed me that this factor grows during the learning process.

Yes.

Eberhard wrote: When I put all the quotes together, it looks like you can shrink every compressed file by a factor
of >= 4.26 with an empty reference table. If that's true, then it is very impressive!

I think there is a small misunderstanding here; let me quote the relevant question and answer.

SDSC wrote: uwequbit wrote:
How big is one pattern in the reference table?
The size of a DNA sequence (pattern) in the reference table varies. It depends on the block size and the data content of the block. I prefer to use the term factor here, which is defined as [block size] / [sequence size]. For version 2 of DNA this factor has a minimum of 4.26, and tests showed values up to 4.78, always starting with an empty reference table and therefore representing the worst-case situation. The main goal at this moment is to improve this value; the higher the better. The magical value for this factor is 5.68: if this value can be reached, DNA can do without a reference table.

The factor mentioned here is the factor between block size and sequence size, which means that for every block processed, the DNA algorithms will calculate a sequence (pattern) that is at least 4.26 times smaller than the processed block. These tests were always started with an empty reference table; this way I could check the maximum amount of data a file would add (worst case). This does not mean that you can shrink every (compressed) file by a factor of >= 4.26 with an empty reference table!

Eberhard wrote: SDSC wrote:
Up to the fifth iteration the key code decreases strongly, but after that the effect of the
remaining iterations gets smaller (thirteen iterations for 750 bytes).
Is there a possibility here to improve your encoding time?

I am afraid not. During encoding the last iterations are the fastest ones. The algorithms are very heavy and need optimisation. In the test version some brute-force programming was also used to speed up testing, and of course this is not the best or most optimal way. :oops:

Best regards, SDSC.
The best way to slow down your competitors is to give them your source code!
SDSC
 
Posts: 24
Joined: Fri 01 Oct 2010, 21:53
