Thread: Count, filter out duplicates for dictionaries

  1. #1

    I have an iterator that returns the following for each entry: item_name, item_size, user_name

    What is the best approach if I want to:
    • Collate similar item names into a single line
    • In addition to point #1, calculate the number of items
    • In addition to point #1, display the user_names associated with the versions and the size each user takes up, in descending order

    Currently I am using a lot of dictionaries, and I am not sure what the best way to approach this is.
    gen_dict = {}
    size_dict = {}
    # my_iterator is the one that I have mentioned as above
    for result in my_iterator:
        gen_dict[result['object_name']] = result['user']
        size_dict[result['user']] = result['dir_size']
        # If same key exists, append value to existing key
        if result['owner'] in size_dict:
    # Filter out duplicates, count versions
    asset_user_dict = defaultdict(set)
    asset_count = defaultdict(int)
    user_ver_count = defaultdict(lambda: defaultdict(int))
    for vers_name, artist_alias in ivy_results.iteritems():
        strip_version_name = vers_name[:-3]
        asset_count[strip_version_name] += 1
        user_ver_count[artist_alias][strip_version_name] += 1
    # Gather the total size of all items for each user
    for user_name, user_size in size_dict.iteritems():
        # This sums up all sizes for that particular user
        size_dict[stalk_name] = sum(user_size)
    for version_name, version_count in sorted(asset_count.iteritems()):
        user_vers_cnt = ', '.join('{0}({1}v, {2})'.format(user, user_ver_count[user][version_name], convert_size_query(ivy_size_query[user])) for user in asset_user_dict[version_name])
        print "| {0:<100} | {1:>12} | {2:>90} |".format(version_name+"(xxx)",
    I tried using dictionaries, and while I can do almost all of the three points above, I am having issues with point #3: either I can't seem to sort the users in order, or the size derived for each user comes out as the same value, perhaps because I am using multiple dictionaries. Any advice is greatly appreciated!

    Suppose my data is something like:
    (1 MiB) "item_C_v001" : "jack"
    (5 MiB) "item_C_v002" : "kris"
    (1 MiB) "item_A_v003" : "john"
    (1 MiB) "item_B_v006" : "peter"
    (2 MiB) "item_A_v005" : "john"
    (1 MiB) "item_A_v004" : "dave"

    By the way, my output currently is:
    Item Name | No. of Vers. | User
    item_A    | 3            | dave(1, 1MiB), john(2, 3MiB)
    item_B    | 1            | peter(1, 1MiB)
    item_C    | 2            | kris(1, 5MiB), jack(1, 1MiB)
    Last edited by xenas; 02-16-2017 at 10:13 PM.

  2. #2
    Claudio A


    Splitting this into three dictionaries is probably overkill here (I might be wrong). You can likely get away with a single dict or OrderedDict and do your collecting in a single pass, which should also be more efficient.

    From the snippet you included I assume your iterator returns a dictionary for each 'result'...

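    from collections import OrderedDict
    data = OrderedDict()
    for result in my_iterator:
        item_name = result.get('object_name', None)
        item_user = result.get('user', None)
        item_size = result.get('dir_size', 0)
        if item_name not in data:
            # First time this item is seen: start its user -> sizes mapping
            data[item_name] = {item_user: [item_size]}
        elif item_user not in data[item_name]:
            # Known item, new user
            data[item_name][item_user] = [item_size]
        else:
            # Known item and user: record one more size
            data[item_name][item_user].append(item_size)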
    The resulting data OrderedDict should then map each item_name to a structure like this:

    # data[item_name1] = {item_user1: [item_size1, item_size2], item_user2: [item_size1, item_size2], etc...}
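One caveat with the structure above: the sample data in the question carries a version suffix (item_C_v001, item_C_v002), so keying on the raw object_name would keep every version as its own entry instead of collating them. A minimal sketch of normalising the key first, assuming the suffix is always "_v" followed by digits; base_name is a hypothetical helper, not something from the original posts:

```python
import re

# Assumption: versions always appear as a trailing "_v" plus digits, e.g. "_v001".
VERSION_SUFFIX = re.compile(r'_v\d+$')

def base_name(object_name):
    """Return the name without its trailing version suffix, if it has one."""
    return VERSION_SUFFIX.sub('', object_name)
```

With this, base_name('item_C_v001') and base_name('item_C_v002') both give 'item_C', which could then be used as the dictionary key in place of the raw name.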
    From this you can derive the information you need.

    for item_name, item_users in data.iteritems():
        count = sum(len(item_sizes) for item_sizes in item_users.values())
        print('Item Name: {} Count: {} User(s): {}'.format(item_name, count, ', '.join(item_users.keys())))
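For point #3 in the question (per-user version count and size, largest first), the same nested structure can be taken one step further. The following is only a sketch under a few assumptions: the records are dictionaries with the object_name, user and dir_size keys seen in the question, sizes are plain numbers in MiB, and everything before the last "_v" is the item name. The records list and the summarise function are made up for illustration.

```python
from collections import OrderedDict

# Sample records mirroring the data in the question; sizes are in MiB.
records = [
    {'object_name': 'item_C_v001', 'user': 'jack', 'dir_size': 1},
    {'object_name': 'item_C_v002', 'user': 'kris', 'dir_size': 5},
    {'object_name': 'item_A_v003', 'user': 'john', 'dir_size': 1},
    {'object_name': 'item_B_v006', 'user': 'peter', 'dir_size': 1},
    {'object_name': 'item_A_v005', 'user': 'john', 'dir_size': 2},
    {'object_name': 'item_A_v004', 'user': 'dave', 'dir_size': 1},
]

def summarise(records):
    """Return rows of (item, version_count, [(user, versions, total_size), ...])."""
    data = OrderedDict()
    for result in records:
        # Assumption: everything before the last "_v" is the item name.
        item = result['object_name'].rsplit('_v', 1)[0]
        sizes = data.setdefault(item, {}).setdefault(result['user'], [])
        sizes.append(result['dir_size'])

    rows = []
    for item in sorted(data):
        users = data[item]
        count = sum(len(sizes) for sizes in users.values())
        # Largest total size first; ties are broken by user name.
        per_user = sorted(
            ((user, len(sizes), sum(sizes)) for user, sizes in users.items()),
            key=lambda entry: (-entry[2], entry[0]),
        )
        rows.append((item, count, per_user))
    return rows

for item, count, per_user in summarise(records):
    users = ', '.join('{0}({1}, {2}MiB)'.format(*entry) for entry in per_user)
    print('{0:<10} | {1:<12} | {2}'.format(item, count, users))
```

Because the sizes are stored per item and per user, a user's total for one item cannot leak into another item, which is the likely cause of the "same value" problem when a separate size dictionary is keyed by user only. For the sample data, the item_A row lists john(2, 3MiB) before dave(1, 1MiB).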

    DISCLAIMER: untested!

    Hopefully this helps you look at it from another angle.

