.. _dtm_onemax:

==========================================
DTM + EAP = DEAP: a Distributed Evolution
==========================================

As part of the DEAP framework, EAP offers an easy DTM integration. As the EAP
algorithms use a map function stored in the toolbox to spawn the evaluations
of the individuals (by default, this is simply the traditional Python
:func:`map`), the parallelization can be done very easily, by replacing the
map operator in the toolbox::

    from deap import dtm

    toolbox.register("map", dtm.map)

Thereafter, ensure that your main code is enclosed in a Python function (for
instance, ``main``), and just add, as the last line::

    dtm.start(main)

For instance, take a look at the short version of the OneMax problem. This is
how it may be parallelized::

    import array
    import logging
    import random

    from deap import algorithms, base, creator, tools
    from deap import dtm

    # Configure logging so that the statistics and results are displayed
    logging.basicConfig(level=logging.INFO)

    creator.create("FitnessMax", base.Fitness, weights=(1.0,))
    creator.create("Individual", array.array, typecode='b',
                   fitness=creator.FitnessMax)

    toolbox = base.Toolbox()

    # Attribute generator
    toolbox.register("attr_bool", random.randint, 0, 1)

    # Structure initializers
    toolbox.register("individual", tools.initRepeat, creator.Individual,
                     toolbox.attr_bool, 100)
    toolbox.register("population", tools.initRepeat, list, toolbox.individual)

    def evalOneMax(individual):
        return sum(individual),

    toolbox.register("evaluate", evalOneMax)
    toolbox.register("mate", tools.cxTwoPoints)
    toolbox.register("mutate", tools.mutFlipBit, indpb=0.05)
    toolbox.register("select", tools.selTournament, tournsize=3)
    toolbox.register("map", dtm.map)

    def main():
        random.seed(64)

        pop = toolbox.population(n=300)
        hof = tools.HallOfFame(1)
        stats = tools.Statistics(lambda ind: ind.fitness.values)
        stats.register("Avg", tools.mean)
        stats.register("Std", tools.std)
        stats.register("Min", min)
        stats.register("Max", max)

        algorithms.eaSimple(toolbox, pop, cxpb=0.5, mutpb=0.2, ngen=40,
                            stats=stats, halloffame=hof)
        logging.info("Best individual is %s, %s", hof[0],
                     hof[0].fitness.values)

        return pop, stats, hof

    dtm.start(main)   # Launch the first task

As one can see, parallelization requires almost no changes at all (an import,
the registration of the distributed map, and the starting instruction), even
with a non-trivial program. This program can now be run on a multi-core
computer, on a small cluster, or on a supercomputer, without any change, as
long as those environments provide an MPI implementation.

.. note::
    In this specific case, the distributed version would actually be *slower*
    than the serial one, because of the extreme simplicity of the evaluation
    function (which takes *less than 0.1 ms* to execute): the small overhead
    generated by the serialization, load balancing, processing and transfer
    of the tasks and the results is not offset by any gain in evaluation
    time. In more complex, real-life problems (for instance sorting
    networks), the benefit of a distributed version is clearly noticeable.
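To get a feel for this trade-off, here is a minimal sketch: artificially
inflating the cost of the evaluation (with a hypothetical ``evalOneMaxSlow``
that simply sleeps for 10 ms, standing in for a real costly evaluation) is
enough to tip the balance in favor of the distributed version::

    import time

    def evalOneMaxSlow(individual):
        # Hypothetical stand-in for an expensive evaluation (a simulation,
        # a learning run, etc.); at 10 ms per task, the serialization and
        # transfer overhead becomes negligible.
        time.sleep(0.01)
        return sum(individual),

    toolbox.register("evaluate", evalOneMaxSlow)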
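Finally, the exact launching procedure depends on your MPI distribution. As a
sketch, with a standard launcher such as ``mpirun`` and a hypothetical script
name ``onemax_dtm.py``, the program would typically be started with one
Python process per worker::

    mpirun -n 4 python onemax_dtm.py

DTM then dispatches the tasks spawned by ``dtm.map`` among the available
processes.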