# Pack, Pickle, Eval: A Field Guide to Storing and Running Code
There's a surprisingly rich vocabulary for one of computing's oldest problems: how do you take code or data, put it somewhere, and then bring it back to life later?
Every era of programming has invented its own terms. Some describe the storage side — packing code up for transport. Others describe the execution side — running code that was stored as data. Many terms cross boundaries or mean completely different things depending on context. This is a tour through all of them.
## The Storage Side: Packing Code Up
### Pack / Unpack
Perl's `pack()` and `unpack()` are fundamental operations: `pack()` encodes a list of values into a binary string according to a template, and `unpack()` extracts them back out.
```perl
# Perl: pack an IP address into 4 bytes
my $packed = pack("C4", 192, 168, 1, 1);
# Unpack it back
my @octets = unpack("C4", $packed);
```
The terms are intuitive because they map to physical reality. You pack a suitcase, ship it somewhere, and unpack it. The contents aren't changed or compressed, just laid out in a fixed, transportable form.
**Where you see it:** Binary protocols, network programming, file format parsing, C structs. Python's `struct.pack()` carries the same concept forward.
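For comparison, the same round trip can be sketched with Python's `struct` module. This is a minimal example that mirrors the Perl snippet above; the variable names are illustrative.

```python
import struct

# Pack four unsigned bytes ("4B"), the same layout as Perl's "C4".
packed = struct.pack("4B", 192, 168, 1, 1)

# Unpacking returns a tuple of the original values.
octets = struct.unpack("4B", packed)

print(packed)  # b'\xc0\xa8\x01\x01'
print(octets)  # (192, 168, 1, 1)
```

The format string plays the role of the packing list: both sides have to agree on it, because the bytes themselves carry no description of their layout.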
### Marshal / Unmarshal
Marshalling is converting an object or data structure into a format suitable for storage or transmission. The term comes from "marshalling" troops — organizing things into proper order for transport.
```ruby
# Ruby uses Marshal explicitly
data = Marshal.dump({name: "task", status: "open"}) # => binary string
obj = Marshal.load(data) # => original hash
# In RPC (Remote Procedure Calls), marshalling converts
# function arguments into bytes for transmission across a network
```
**The nuance:** Marshalling implies more than just serialization. It includes converting data between different representations — like translating between how Python stores an integer and how C stores one. COM, CORBA, and Java RMI all used "marshalling" prominently.
**Where you see it:** RPC systems, cross-language communication, Go's `encoding/json.Marshal()`, Windows COM interop.
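The RPC case can be sketched in a few lines of Python. This is a toy illustration, not any real RPC framework's format: `marshal_call`, `unmarshal_call`, and the `resize` call are hypothetical names, and JSON stands in for whatever encoding a real system would use.

```python
import json

def marshal_call(func_name, *args, **kwargs):
    """Turn a call description into bytes suitable for a socket or queue."""
    message = {"func": func_name, "args": list(args), "kwargs": kwargs}
    return json.dumps(message).encode("utf-8")

def unmarshal_call(raw):
    """Rebuild the call description on the receiving side."""
    message = json.loads(raw.decode("utf-8"))
    return message["func"], message["args"], message["kwargs"]

# Hypothetical remote call: resize(640, 480, keep_aspect=True)
wire_bytes = marshal_call("resize", 640, 480, keep_aspect=True)
func_name, args, kwargs = unmarshal_call(wire_bytes)

print(func_name, args, kwargs)  # resize [640, 480] {'keep_aspect': True}
```

Only the name and the arguments cross the boundary. The receiver is expected to already have a function called `resize`.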
### Pickle / Unpickle
Python's term for serializing arbitrary objects to bytes. The metaphor is preserving food — you pickle cucumbers to store them, then eat them months later. Same idea with code and data.
```python
import pickle
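# Pickle a dict that holds a function. The function is stored by
# reference (module and name), not as code.
def my_function(a, b, c):
    return a + b + c

data = {"callback": my_function, "args": [1, 2, 3]}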
stored = pickle.dumps(data)
# Months later, unpickle and run it
restored = pickle.loads(stored)
restored["callback"](*restored["args"])
```
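**The danger:** Pickle can serialize almost any Python object, and the format itself is a small instruction stream: a pickle can tell the loader to import and call any callable it names. Functions and classes are stored by reference (module and name), not as code, but that is enough for a crafted payload to execute arbitrary code on load. Unpickling untrusted data is eval masquerading as deserialization.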
**Where you see it:** Throughout Python: Django's cache framework, Celery task arguments (pickle was the default serializer before Celery 4.0), scikit-learn model saving, and values stored in Redis.
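The risk is easy to demonstrate with a harmless payload. In this sketch the class name `Innocent` is made up, and the "attack" only computes `6 * 7`; the mechanism, `__reduce__`, is the standard hook that lets an object say how it should be rebuilt.

```python
import pickle

class Innocent:
    """Looks like plain data, but controls how it is rebuilt on load."""
    def __reduce__(self):
        # Tell the unpickler: "to rebuild me, call eval('6 * 7')".
        # A real attack would name something far less harmless.
        return (eval, ("6 * 7",))

payload = pickle.dumps(Innocent())

# Loading runs the callable named in the payload.
# No Innocent object comes back, only the result of the call.
result = pickle.loads(payload)
print(result)  # 42
```

Nothing in `pickle.loads` asked to run code. The payload decided that, which is why pickles from untrusted sources should never be loaded.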
### Freeze / Thaw
Perl's `Storable` module uses `freeze()` and `thaw()`. The metaphor: flash-freeze your data structure, store it indefinitely, thaw it when needed. The object comes back exactly as it was.
```perl
use Storable qw(freeze thaw);
my $frozen = freeze(\%complex_data);
# Store $frozen in a file, database, wherever
my $thawed = thaw($frozen);
# $thawed is a reference to a deep copy of the original structure
```
**The nuance:** Freeze/thaw implies perfect preservation. Unlike pack/unpack (which works with known formats), freeze/thaw handles arbitrary complex objects — nested references, circular structures, blessed objects.
**Where you see it:** Perl's `Storable`, some game engines (save states), database session storage. Ruby's `Marshal` is conceptually the same as freeze/thaw.
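Python's `pickle` gives a comparable guarantee, which makes the "perfect preservation" claim easy to check. This is a small sketch with made-up data: one list shared under two keys, plus a dict that contains itself.

```python
import pickle

# A structure that refers to itself, plus one list shared by two keys.
shared = [1, 2, 3]
original = {"a": shared, "b": shared}
original["self"] = original

restored = pickle.loads(pickle.dumps(original))

print(restored["self"] is restored)    # True: the cycle survived
print(restored["a"] is restored["b"])  # True: the sharing survived
print(restored is original)            # False: it is a copy, not the same object
```

A flat format such as JSON would fail on the cycle and duplicate the shared list, which is the practical difference between serializing values and preserving an object graph.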
### Serialize / Deserialize
The most generic term. Convert structured data into a flat sequence (a "series") of bytes, then reconstruct it.
```javascript
// JavaScript: JSON serialization
const serialized = JSON.stringify({status: "open", count: 42});
// => '{"status":"open","count":42}'
const deserialized = JSON.parse(serialized);
// => {status: "open", count: 42}
```
**The nuance:** Serialization typically implies data only, not code. JSON can serialize objects but not functions. This is a deliberate safety boundary — you're moving data, not behavior.
**Where you see it:** Every API, every database, every cache, every message queue. JSON, XML, Protocol Buffers, MessagePack, BSON, YAML.
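The data-only boundary is visible in Python's `json` module as well. In this short sketch, `handler` is a placeholder function used only to show the refusal.

```python
import json

record = {"status": "open", "count": 42}
print(json.dumps(record))  # {"status": "open", "count": 42}

def handler():
    return "ran"

try:
    json.dumps({"callback": handler})
    refused = False
except TypeError as exc:
    # The encoder refuses: a function is behavior, not data.
    refused = True
    print(exc)

print(refused)  # True
```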
### Bundle
A bundle is code, data, and metadata packed together as a single deployable unit. Unlike serialization (which is flat), a bundle preserves structure and relationships.
```
# macOS app bundle — a directory disguised as a file
MyApp.app/
  Contents/
    MacOS/MyApp      # executable code
    Resources/       # images, strings, configs
    Info.plist       # metadata
```
```sh
# Perl PAR (Perl Archive) — bundle script + dependencies
pp -o myapp script.pl
# Creates a single executable containing the script,
# all its modules, and the Perl interpreter itself
```
**The nuance:** Bundles are self-contained. A pickled object needs the original class definition to unpickle. A bundle carries everything it needs with it.
**Where you see it:** macOS `.app` bundles, Java `.jar` files, Webpack bundles, Docker images (the ultimate bundle — code + OS + dependencies), Android `.apk` files.
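Python has a small bundling tool in its standard library, `zipapp`, which can serve as a rough analogue. The sketch below builds a throwaway app in a temporary directory; the file names `myapp`, `settings.txt`, and `myapp.pyz` are invented for the example.

```python
import tempfile
import zipapp
import zipfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # A tiny hypothetical app: one entry point plus one data file.
    src = Path(tmp) / "myapp"
    src.mkdir()
    (src / "__main__.py").write_text('print("hello from the bundle")\n')
    (src / "settings.txt").write_text("mode=demo\n")

    # Pack the whole directory into a single runnable archive.
    bundle = Path(tmp) / "myapp.pyz"
    zipapp.create_archive(src, target=bundle)

    # The archive is an ordinary zip file carrying code and data together.
    with zipfile.ZipFile(bundle) as zf:
        contents = sorted(zf.namelist())

print(contents)
```

Running `python myapp.pyz` would execute the bundled `__main__.py`. Unlike a PAR executable, a `.pyz` file does not include the interpreter, so it sits partway along the self-containment scale.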
### Encapsulation
This term means different things in different contexts:
**Networking:** Wrapping a packet inside another packet. Each protocol layer adds its own header around the payload. An HTTP request is encapsulated in TCP, which is encapsulated in IP, which is encapsulated in an Ethernet frame. Like nested envelopes.
```
[Ethernet Header [IP Header [TCP Header [HTTP Data]]]]
```
**OOP:** Hiding internal state behind a public interface. The data is "encapsulated" inside the object — you can't access it except through methods.
```python
class Timer:
    def __init__(self):
        self._seconds = 0  # encapsulated: don't touch directly

    def start(self):  # public interface
        ...
```
**The nuance:** In networking, encapsulation is about layered wrapping for transport. In OOP, it's about access control. Same word, completely different concepts. The networking meaning is closer to pack/marshal — wrapping data for a journey.
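The nested-envelope picture for networking can be made concrete with a toy function. Real protocol headers are binary structures with addresses, lengths, and checksums; here each "header" is just a text label, and `encapsulate` is a name invented for this sketch.

```python
def encapsulate(payload: bytes, layers: list[str]) -> bytes:
    """Wrap payload in one toy header per layer, innermost layer first."""
    frame = payload
    for layer in layers:
        frame = f"[{layer}|".encode() + frame + b"]"
    return frame

# Innermost to outermost: TCP wraps the data, IP wraps TCP, Ethernet wraps IP.
frame = encapsulate(b"GET / HTTP/1.1", ["TCP", "IP", "Ethernet"])
print(frame)  # b'[Ethernet|[IP|[TCP|GET / HTTP/1.1]]]'
```

Each layer treats everything inside it as an opaque payload, which is why the layers can be developed and swapped independently.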
## The Execution Side: Running Stored Code
### Eval
The most direct: take a string of code and execute it at runtime. The code doesn't exist as compiled instructions — it's text that becomes code on demand.
```perl
# Perl: eval a string
my $code = 'print "Hello from eval\n"';
eval $code;
# Perl: eval a block (error handling)
eval {
    dangerous_operation();
};
warn $@ if $@;
```
```python
# Python: eval an expression
result = eval("2 + 2") # => 4
# Python: exec a statement
exec("x = 42")
```
```javascript
// JavaScript: eval
eval('console.log("Hello from eval")');
```
**The danger:** Eval is powerful and dangerous. If the string comes from user input, you've given them arbitrary code execution. "Eval is evil" became a mantra for good reason.
**The nuance:** Perl has two kinds of `eval`: string eval (dangerous, compiles and runs a string as code) and block eval (safe, just traps exceptions, much like `try`). The distinction matters enormously, yet both use the same keyword.
**Where you see it:** Template engines, configuration systems, REPLs, dynamic code generation, and unfortunately, security vulnerabilities.
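When the stored string is only meant to hold data, Python offers a narrower tool than `eval`. The sketch below uses `ast.literal_eval`, which accepts literals (numbers, strings, lists, dicts, and so on) and rejects anything that would require execution. The settings string is made up for the example.

```python
import ast

# Plain literals are parsed into values without running any code.
settings = ast.literal_eval("{'retries': 3, 'hosts': ['a', 'b']}")
print(settings["retries"])  # 3

# Anything that would need execution (a call, a name lookup) is rejected.
try:
    ast.literal_eval("__import__('os').getcwd()")
    rejected = False
except ValueError:
    rejected = True

print(rejected)  # True
```

This is not a sandbox: very large or deeply nested input can still exhaust memory or the parser's recursion limit, so it narrows the risk rather than removing it.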
### Dispatch
Routing a request to the right piece of code. A dispatch table stores code (usually as references or function pointers) keyed by some identifier, then looks up and runs the right one.
```perl
# Perl: dispatch table (extremely common pattern)
my %dispatch = (
    'check_links' => sub { check_all_links() },
    'clean_data'  => sub { run_data_janitor() },
    'qa_monitor'  => sub { check_all_pages() },
);
my $action = get_user_command();
$dispatch{$action}->(); # look up and run
```
```python
# Python: same pattern
dispatch = {
    'check_links': check_all_links,
    'clean_data': run_data_janitor,
}
dispatch[action]()
```
**The nuance:** Dispatch is the act of routing, not the storage. The table stores code; dispatch is what happens when you look it up and call it. In Celery, the worker dispatches tasks — it receives a task name string, looks it up in a registry, and calls the function.
**Where you see it:** Web frameworks (URL routing), event systems, command patterns, Celery task registry, RPC systems, operating system syscall tables.
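A slightly fuller version of the pattern separates the two halves: a registry that stores code and a `dispatch` function that routes to it. This is a minimal sketch in the spirit of a task registry, not Celery's actual implementation; `REGISTRY`, `task`, and `dispatch` are names chosen for the example.

```python
REGISTRY = {}

def task(name):
    """Decorator that files a function in the dispatch table under `name`."""
    def register(func):
        REGISTRY[name] = func
        return func
    return register

@task("check_links")
def check_all_links():
    return "links checked"

@task("clean_data")
def run_data_janitor():
    return "data cleaned"

def dispatch(name):
    """Look up the handler for `name` and run it, failing clearly if absent."""
    handler = REGISTRY.get(name)
    if handler is None:
        raise KeyError(f"unknown command: {name!r}")
    return handler()

print(dispatch("check_links"))  # links checked
```

Because the lookup key is a plain string, it can come from a URL, a message, or a database row, while the set of things that can actually run stays limited to what was registered.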
### Invoke / Invocation
Calling a piece of stored code. More formal than "call" — implies the code was prepared earlier and is now being activated.
```java
// Java: Method invocation via reflection
Method method = obj.getClass().getMethod("processTask");
method.invoke(obj); // invoke the stored method reference
```
```python
# Python: callable invocation
stored_function = get_handler("link_checker")
stored_function() # invoke it
```
**Where you see it:** Java reflection, .NET delegates, event handlers, and remote-call systems such as Java RMI ("Remote Method Invocation").
### Callback
Code passed to another function, to be called back later when something happens. You're handing off code now that will run in the future, at someone else's discretion.
```javascript
// JavaScript: the callback era
fs.readFile('data.json', function(err, data) {
    // This code runs LATER, when the file is ready
    console.log(data);
});
```
```perl
# Perl: callbacks everywhere
$button->configure(-command => sub { save_file() });
```
**The nuance:** A callback is specifically code that runs in response to something. It's not just stored code; it's stored code with a trigger condition. The code that registers the callback doesn't control when, or whether, it will run.
**Where you see it:** Event-driven programming, GUI frameworks, Node.js (before Promises), database query handlers, webhook endpoints.
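The split between registering code and running it can be shown with a toy widget. `Button`, `on_click`, and `click` are hypothetical names for this sketch, not a real GUI toolkit's API.

```python
class Button:
    """Hypothetical widget that remembers handlers and calls them on click."""
    def __init__(self):
        self._handlers = []

    def on_click(self, callback):
        self._handlers.append(callback)  # stored now...

    def click(self):
        for callback in self._handlers:  # ...called back later
            callback()

saved = []
button = Button()
button.on_click(lambda: saved.append("file saved"))

print(saved)  # []  (registering runs nothing)
button.click()
print(saved)  # ['file saved']
```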
### Late Binding / Dynamic Dispatch
The code to execute is determined at runtime, not compile time. The variable holds a reference that resolves to different code depending on context.
```python
# Python: late binding via import path
import importlib
module = importlib.import_module("core.tasks")
func = getattr(module, "run_link_checker")
func(user_id) # resolved at runtime
```
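This is close to what Celery does. A task is identified by a string name such as `"core.tasks.run_link_checker"`, and that string, not the function, is what gets stored in the schedule and sent in messages. The worker resolves the name to a function only when the task runs. Deploy new code and restart the workers, and the next invocation gets the new version, because the binding happens late.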
**Where you see it:** Plugin systems, dependency injection, OOP polymorphism, Celery task resolution, dynamic imports.
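The string-to-function step can be wrapped in a small helper. This is a sketch of the general technique, with a made-up `resolve` function; standard-library targets stand in for application tasks so the example runs anywhere.

```python
import importlib

def resolve(dotted_path):
    """Turn 'package.module.attr' into the object it names, at call time."""
    module_name, _, attr = dotted_path.rpartition(".")
    module = importlib.import_module(module_name)
    return getattr(module, attr)

# The string could come from a config file or a database row.
func = resolve("math.sqrt")
print(func(16))  # 4.0

encode = resolve("json.dumps")
print(encode([1, 2]))  # [1, 2]
```

Since any importable name can be resolved this way, real systems usually restrict which strings they will accept, for the same reasons that apply to `eval`.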
### Thunk
A piece of computation wrapped in a function to delay its execution. You don't run it now — you wrap it up and run it when needed.
```javascript
// JavaScript: a thunk
const fetchUser = () => fetch('/api/user');
// fetchUser is a thunk — it doesn't DO anything yet
// It's computation, deferred
// Later:
const response = await fetchUser(); // NOW it runs
```
```scheme
;; Scheme: thunks are fundamental
(define (lazy-value) (expensive-computation))
;; lazy-value is a thunk — call it when you need the result
```
**The nuance:** A thunk is specifically about deferred evaluation. A callback waits for an event. A thunk waits for you to ask for the value. The distinction is who decides when it runs — with callbacks, the system decides; with thunks, you decide.
**Where you see it:** Redux (redux-thunk middleware), lazy evaluation, memoization, Haskell (everything is a thunk until needed).
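Lazy evaluation and memoization combine naturally: a thunk that remembers its result behaves like a value computed at most once. A minimal sketch, where `make_thunk` and `expensive` are names invented for the example:

```python
def make_thunk(compute):
    """Wrap a zero-argument computation so it runs at most once, on demand."""
    cache = []
    def thunk():
        if not cache:
            cache.append(compute())
        return cache[0]
    return thunk

calls = []
def expensive():
    calls.append("ran")
    return 42

lazy_value = make_thunk(expensive)
print(len(calls))    # 0  (nothing has run yet)
print(lazy_value())  # 42 (forced on first request)
print(lazy_value())  # 42 (served from the cache)
print(len(calls))    # 1
```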
### Coderef (Code Reference)
Perl's specific term for a reference to a subroutine stored in a scalar variable. Not a string to eval — a direct reference to compiled code.
```perl
# Perl: code reference
my $checker = \&check_links; # reference to named sub
my $cleaner = sub { clean() }; # anonymous sub (also a coderef)
# Call it
$checker->();
$cleaner->();
# Store in a data structure
my @pipeline = (\&validate, \&transform, \&save);
for my $step (@pipeline) {
    $step->($data);
}
```
**The nuance:** A coderef is compiled code, not a string. It's safer than eval because you can't inject arbitrary code — the reference points to something that was already compiled. It's the Perl equivalent of a function pointer in C or a lambda in Python.
**Where you see it:** Perl (extensively), and conceptually in every language that treats functions as first-class values.
## The Transport Layer: Moving Code Between Processes
### Message Passing
Sending a message (containing serialized data or code references) from one process to another. The receiver unpacks and acts on it.
```python
# Celery: message passing via Redis
check_links.delay(user_id=42)
# This serializes the function name + arguments,
# sends them as a message through Redis,
# and a worker picks it up, deserializes, and runs it
```
**Where you see it:** Celery + Redis, Erlang/OTP (message passing is the only way processes communicate), ZeroMQ, RabbitMQ, actor model systems.
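The same flow can be simulated inside one process. In this sketch a `queue.Queue` stands in for the broker and a thread stands in for the worker; `TASKS`, `inbox`, and the message shape are assumptions made for illustration, not Celery's actual protocol.

```python
import json
import queue
import threading

def check_links(user_id):
    return f"checked links for user {user_id}"

TASKS = {"check_links": check_links}  # the worker's registry
inbox = queue.Queue()                 # stands in for the broker
results = []

def worker():
    while True:
        raw = inbox.get()
        if raw is None:                # sentinel: shut down
            break
        message = json.loads(raw)      # deserialize
        func = TASKS[message["task"]]  # dispatch by name
        results.append(func(**message["kwargs"]))

thread = threading.Thread(target=worker)
thread.start()

# The sender ships a name and arguments, never the function itself.
inbox.put(json.dumps({"task": "check_links", "kwargs": {"user_id": 42}}))
inbox.put(None)
thread.join()

print(results)  # ['checked links for user 42']
```

Sender and receiver share only the message format and the task name, which is what allows them to live in different processes or on different machines.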
### Wire Format / On-the-Wire
The representation of data as it travels between systems. JSON, Protocol Buffers, MessagePack — these are wire formats. The same data might be a Python dict in memory, JSON on the wire, and a JavaScript object at the destination.
```
Python dict → JSON string → HTTP body → JSON string → JavaScript object
(in-memory)   (packed)      (on wire)   (unpacked)    (in-memory)
```
**Where you see it:** Every API, every microservice, every RPC system.
### Payload
The actual content being transported, stripped of headers and metadata. In networking, the payload is the data inside the packet after you remove all the protocol headers. In task queues, the payload is the function name and arguments.
```python
# Celery message payload (simplified)
{
    "task": "core.tasks.run_link_checker",  # what to run
    "args": [42],                           # arguments
    "kwargs": {},
    "retries": 0
}
```
**Where you see it:** APIs, webhooks, message queues, network packets, JWTs (the claims are the payload).
## The Full Journey
When a scheduled agent runs on a platform like AskRobots, every one of these concepts is in play:
1. **Bundle** — The Django app bundles the agent code with its dependencies
2. **Serialize** — django-celery-beat stores the task's import path and schedule as JSON in PostgreSQL
3. **Late binding** — Beat reads the string `"core.tasks.run_link_checker_all_users"` from the database
4. **Marshal** — Beat packs the task name and arguments into a Celery message
5. **Message passing** — The message travels through Redis to an available worker
6. **Dispatch** — The worker looks up the function in its task registry
7. **Unpack/Unmarshal** — Arguments are deserialized
8. **Eval** (loosely) — The worker imports the module and calls the function
9. **Invoke** — The agent's `execute()` method runs
10. **Pickle** — Results are serialized (with pickle in older Celery setups, JSON by default since Celery 4.0) and sent back through Redis
All of those terms from the Perl era — pack, marshal, eval, dispatch, coderef — they're all still here. We just buried them under frameworks so you don't have to think about them anymore. But they're running underneath everything, every time a background agent checks your links at 3 AM.
---
*The vocabulary of computing accumulates like geological layers. We build new abstractions on top, but the old terms persist because they describe real, distinct operations. Pack is not marshal is not serialize, even when they look the same from the outside. The distinctions matter when something goes wrong — and something always goes wrong at the boundary between stored code and running code.*