Assuming I've understood the problem correctly, the goal is to identify the distinct connected graphs (i.e. connected components) in the provided list of edges. The output could look something like this:
[[2901_1, 3070_1], [10397_1, 3070_1], [10543_1, 3070_1], [3070_1, 2356_1], [2356_1, 106_1], ... ],
[...],
[...]
Basically, it's a collection of graphs, where each graph is a collection of edges and each edge connects two vertices. It could be in XML or JSON notation as well. There are many options.
I've taken real IDs from the sample data, although even this first graph isn't complete: I stopped looking for more nodes when I realized it could potentially become quite large.
Anyway, it looks like an interesting graph theory problem and something I'd be happy to think about more. It could certainly be solved in almost any language, but I would probably go with Java or C#, as both are solid object-oriented languages and I've used each of them full time for several years.
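To give a rough idea of the approach, here is a minimal Java sketch that groups edges into connected graphs using a union-find structure. It's only an illustration under my own assumptions: the class name EdgeGrouper and the method group are made up for this example, edges are assumed to arrive as pairs of string IDs, and a self-connected node is treated as its own one-node graph (that part is still an open question). The IDs are the ones quoted from the sample data.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: groups edges into connected graphs via union-find.
class EdgeGrouper {

    // Union-find parent pointers, keyed by vertex ID.
    private final Map<String, String> parent = new LinkedHashMap<>();

    // Returns the representative vertex of v's graph, compressing the path.
    private String find(String v) {
        parent.putIfAbsent(v, v);
        String root = v;
        while (!root.equals(parent.get(root))) {
            root = parent.get(root);
        }
        String cur = v;
        while (!cur.equals(root)) {
            String next = parent.get(cur);
            parent.put(cur, root);
            cur = next;
        }
        return root;
    }

    // Merges the graphs containing a and b (a no-op when they already match).
    private void union(String a, String b) {
        String ra = find(a);
        String rb = find(b);
        if (!ra.equals(rb)) {
            parent.put(ra, rb);
        }
    }

    // Each edge is a two-element array {from, to}. The result holds one list
    // of edges per connected graph, in order of first appearance.
    static List<List<String[]>> group(List<String[]> edges) {
        EdgeGrouper uf = new EdgeGrouper();
        for (String[] e : edges) {
            uf.union(e[0], e[1]);
        }
        Map<String, List<String[]>> byRoot = new LinkedHashMap<>();
        for (String[] e : edges) {
            byRoot.computeIfAbsent(uf.find(e[0]), k -> new ArrayList<>()).add(e);
        }
        return new ArrayList<>(byRoot.values());
    }

    public static void main(String[] args) {
        List<String[]> edges = List.of(
            new String[] {"2901_1", "3070_1"},
            new String[] {"10397_1", "3070_1"},
            new String[] {"10543_1", "3070_1"},
            new String[] {"3070_1", "2356_1"},
            new String[] {"2356_1", "106_1"},
            // Assumption: a self-connected node is kept as a one-node graph.
            new String[] {"10002_1", "10002_1"}
        );
        int n = 1;
        for (List<String[]> graph : group(edges)) {
            StringBuilder sb = new StringBuilder("Graph " + n++ + ":");
            for (String[] e : graph) {
                sb.append(" [").append(e[0]).append(", ").append(e[1]).append("]");
            }
            System.out.println(sb);
        }
    }
}
```

With the six edges above this should produce two graphs: one containing the five connected edges and one containing only the 10002_1 self-edge. If self-connected nodes should be ignored instead, they could simply be filtered out before calling group.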
Also, I've noticed some nodes that seem to be connected only to themselves (e.g. 10002_1). Are they 'degenerate' one-node networks, or should they simply be ignored?
Let me know if you have any additional questions.
Regards,
Luke