Graipher

One advantage of the logging module is that you can pass it a format string and the objects to log at some level, and the string formatting is only performed if it is actually needed (i.e. the configured logging level is low enough for the message to be emitted). Also, having the gc.collect() call inline really hides it; I was looking for it for some time. So I would do this instead:

import gc
import logging

# Assumed module-level logger; use whatever logger your code already has.
logger = logging.getLogger(__name__)


def wipe_sensitive_info(d, keys):
    """
    In Python we are not able to overwrite memory areas. What we do is:
    - remove references to the specified objects, in this case the values stored under the given keys in a dictionary
    - immediately call garbage collection
    """
    # TODO: the following points are pending
    #   - this relies on the gc to remove the objects from memory (de-allocate memory).
    #   - the memory itself is not guaranteed to be overwritten.
    #   - this will only deallocate objects which are not referenced anywhere else (ref count 0 after del)
    for key in keys:
        del d[key]
    n_garbage_collected = gc.collect()
    logger.info("Garbage collected %s objects", n_garbage_collected)
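
To make the point about deferred formatting concrete, here is a minimal sketch. The `Expensive` class, the `lazy_demo` logger name and the in-memory stream are all made up for illustration; the idea is only to count how often the argument is actually turned into a string.

```python
import io
import logging


class Expensive:
    """Hypothetical object that counts how often it is converted to a string."""

    def __init__(self):
        self.str_calls = 0

    def __str__(self):
        self.str_calls += 1
        return "expensive"


# Dedicated logger writing to an in-memory stream, so the demo is self-contained.
stream = io.StringIO()
logger = logging.getLogger("lazy_demo")
logger.setLevel(logging.INFO)
logger.addHandler(logging.StreamHandler(stream))
logger.propagate = False  # keep the root logger's handlers out of the picture

obj = Expensive()

# DEBUG is below the configured INFO level: the message is dropped
# before any formatting happens, so __str__ is never called.
logger.debug("Value: %s", obj)
calls_after_debug = obj.str_calls

# INFO is enabled: the handler formats the record, calling __str__.
logger.info("Value: %s", obj)
calls_after_info = obj.str_calls
```

Had the message been built eagerly (for example with `"Value: %s" % obj`), the string conversion would have happened for the debug call as well, even though nothing is logged.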

You could add a check for objects which still have references elsewhere:

import sys

...

for key in keys:
    if sys.getrefcount(d[key]) > 2:
        # The count returned is generally one higher than you might expect,
        # because it includes the (temporary) reference as an argument to
        # getrefcount().
        logger.debug("Object potentially has references left: %s", key)
    del d[key]

Note that this might give you some false positives for interned objects, like small integers or strings. Case in point, in an interactive session, sys.getrefcount("s") just gave me 1071.
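
If you want to convince yourself of where the threshold comes from, a rough sketch like the following might help. It assumes CPython, and the names `d`, `baseline`, `extra` and `with_extra` are just for illustration. A fresh `object()` is used so that interning does not get in the way.

```python
import sys

# A fresh object that only the dictionary refers to.
d = {"token": object()}

# Baseline: the dictionary's reference plus the temporary one
# created for the call to getrefcount() itself.
baseline = sys.getrefcount(d["token"])

# Keep one additional reference around, as some other part of the
# program might do.
extra = d["token"]
with_extra = sys.getrefcount(d["token"])

# The count goes up by exactly one for the additional reference.
print(baseline, with_extra)
```

Anything above the baseline means that deleting the key from the dictionary will not free the object, because something else is still holding on to it.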
