
Is this the latest fork of this project? :)

Feb 28, 2011 at 1:18 AM

I was thinking about making some mods to DupliFinder but didn't notice till just now that this project has made some good advances.

So my questions are:

1) Do you know of any other forks I may be missing?

2) I assume work stopped on this project because it was school-related?

Great work and regards,

Lee

Mar 2, 2011 at 6:35 AM
  1.  I don't know of any other forks.
  2. Yeah, work stopped--not necessarily because it was school related (there were other things I wanted to do with this project), but mostly because everyone who worked on it is busy.

Feel free to fork away, or if you want to I can make you a maintainer on this project and you're welcome to hack away.  Let me know if you have any questions and I'll be happy to answer.

Mar 2, 2011 at 11:38 PM
Hi, thanks for the info. I've started making changes :).

I would like to get the project robust and performant enough to handle a set of 100,000 images. To make this practical, it seems there are a few areas to attack:

1) The O(n²) number of comparisons (n(n-1)/2 pairs), which quickly becomes a very large number. However, I have a couple of ideas on how to reduce it.

2) Loading images. Even though this is an O(n) process, the LoadImage API being used currently is very slow for this usage.

3) The SSIM calculation. I'm guessing this could be a lot faster using a SIMD helper like Mono.Simd, which uses Intel SSE instructions to do batch floating-point ops.

4) Robustness. Since this is such a long-running operation for n = 100,000, it really helps to have detailed progress info and possibly the ability to restart after a crash from where it left off.
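For item 1, one idea is to bucket images by a cheap fingerprint first and only run the expensive SSIM comparison within each bucket. Here's a rough sketch (in Python for brevity, not the project's actual C# code); `average_hash` is the standard average-hash trick, and the thumbnail loading is assumed to happen elsewhere:

```python
from collections import defaultdict

def average_hash(pixels):
    """Compute a 64-bit average hash from an 8x8 grayscale thumbnail.

    `pixels` is a flat list of 64 intensity values (0-255). Each bit is 1
    if the pixel is brighter than the mean, so visually similar images
    tend to land in the same bucket.
    """
    mean = sum(pixels) / len(pixels)
    bits = 0
    for p in pixels:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits

def candidate_pairs(images):
    """Group images by hash; only pairs inside a bucket need the full
    SSIM/MSE comparison, instead of all n*(n-1)/2 pairs."""
    buckets = defaultdict(list)
    for name, pixels in images:
        buckets[average_hash(pixels)].append(name)
    for names in buckets.values():
        for i in range(len(names)):
            for j in range(i + 1, len(names)):
                yield names[i], names[j]

# Two identical thumbnails and one very different one: only the
# identical pair survives the prefilter.
imgs = [
    ("a.jpg", [10] * 32 + [200] * 32),
    ("b.jpg", [10] * 32 + [200] * 32),
    ("c.jpg", [200] * 32 + [10] * 32),
]
print(list(candidate_pairs(imgs)))
```

Exact-equality bucketing is crude (near-duplicates whose hashes differ by a bit or two get missed; a Hamming-distance search would catch those), but even this cuts the quadratic work down dramatically.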
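For item 4, the restart-after-crash part could be as simple as journaling each completed comparison to a file and skipping already-journaled pairs on the next run. A minimal sketch (again in Python; all names here are hypothetical, not existing DupliFinder APIs):

```python
import json
import os

def compare_all(pairs, compare, journal="progress.jsonl"):
    """Run `compare` over every pair, journaling each result so a crashed
    run can resume where it left off. `pairs` must enumerate in a stable
    order across runs for the skip logic to line up."""
    done = {}
    if os.path.exists(journal):
        with open(journal) as f:
            for line in f:
                rec = json.loads(line)
                done[(rec["a"], rec["b"])] = rec["score"]
    with open(journal, "a") as f:
        for a, b in pairs:
            if (a, b) in done:
                continue  # already computed before the crash
            score = compare(a, b)
            f.write(json.dumps({"a": a, "b": b, "score": score}) + "\n")
            f.flush()  # make sure the record survives a crash
            done[(a, b)] = score
    return done
```

One line per result (JSON Lines) means a partially written last line after a hard crash costs you at most one comparison; it also doubles as the detailed progress log.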

Thanks for offering to make me the maintainer but for now I'm just experimenting.

If you have any ideas, please drop me a note anytime.

Thanks,
Lee


May 11, 2011 at 9:49 PM
Edited May 11, 2011 at 9:49 PM

Sorry, been busy... yeah, handling 100k+ images would be difficult, and as you pointed out the basic issue is the need to compare each image to every other image. If you wanted to do huge library comparisons, this is really what you'd need to focus on. To be frank, I think you could test with 5-10 GB of photo data, and even being able to process that much junk in a reasonable amount of time would be a great start.

I'd never seen Mono.Simd before, but that looks like a great possibility for some significant improvement. The SSIM calculation is definitely slower than MSE/MAE, so that'd be a good place to reap some gains. SSIM gave us by far the best results, but it was significantly more costly than the error-based methods.
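To make the cost difference concrete: MSE is one pass over the pixels, while SSIM needs means, variances, and a covariance (and the real metric averages that over many local windows, which multiplies the work). A toy comparison on flat grayscale arrays, sketched in Python rather than the project's C# and using a single global window instead of the proper sliding-window SSIM:

```python
def mse(x, y):
    """Mean squared error: a single cheap pass over the pixels."""
    return sum((a - b) ** 2 for a, b in zip(x, y)) / len(x)

def global_ssim(x, y, L=255):
    """Single-window SSIM. The real metric computes this over many local
    windows and averages, which is what makes it costly. Constants are
    the usual stabilizers C1=(0.01*L)^2, C2=(0.03*L)^2."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    vx = sum((a - mx) ** 2 for a in x) / n
    vy = sum((b - my) ** 2 for b in y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    c1, c2 = (0.01 * L) ** 2, (0.03 * L) ** 2
    return ((2 * mx * my + c1) * (2 * cov + c2)) / \
           ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))

a = [10, 20, 30, 40]
print(mse(a, a))          # 0.0 for identical images
print(global_ssim(a, a))  # 1.0 for identical images
```

The variance/covariance terms are exactly the batch floating-point work that a SIMD helper like Mono.Simd should speed up nicely.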