Based on Ceph v16.2.5
Architecture
The Objecter is how Ceph clients (OSDC) package all their requests. It
works asynchronously in a callback fashion: a `Context*`, which is roughly Ceph's
own counterpart of `std::function`, is associated with each parameterized request
`Op`. The `Op` is sent to the responsible OSD via Messenger; when the reply arrives,
Messenger kicks off `Objecter::ms_dispatch()`, where the callbacks get
executed.
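The callback flow described above can be illustrated with a small toy model. This is only a sketch, not Ceph code: `ToyContext`, `LambdaContext`, `ToyOp`, and `ToyObjecter` are hypothetical names, and the messenger is reduced to a direct call to `dispatch_reply()`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <map>
#include <utility>

// Hypothetical stand-in for a one-shot callback taking a result code.
struct ToyContext {
  virtual ~ToyContext() = default;
  virtual void finish(int r) = 0;
  // Run the callback once, then destroy it.
  void complete(int r) { finish(r); delete this; }
};

// Adapter so that a lambda can be used as a ToyContext.
struct LambdaContext : ToyContext {
  std::function<void(int)> fn;
  explicit LambdaContext(std::function<void(int)> f) : fn(std::move(f)) {}
  void finish(int r) override { fn(r); }
};

// Toy request: a transaction id plus the callback to run on reply.
struct ToyOp {
  uint64_t tid = 0;
  ToyContext* onfinish = nullptr;
};

// Toy objecter: remembers in-flight ops and fires callbacks on reply.
class ToyObjecter {
  std::map<uint64_t, ToyOp> inflight_;
  uint64_t last_tid_ = 0;

public:
  // Register a request and return its tid. A real client would also
  // hand a message to the messenger at this point.
  uint64_t submit(ToyContext* onfinish) {
    ToyOp op;
    op.tid = ++last_tid_;
    op.onfinish = onfinish;
    inflight_[op.tid] = op;
    return op.tid;
  }

  // Called when a reply for `tid` arrives. Returns false for unknown tids.
  bool dispatch_reply(uint64_t tid, int result) {
    auto it = inflight_.find(tid);
    if (it == inflight_.end()) return false;
    ToyContext* cb = it->second.onfinish;
    inflight_.erase(it);
    if (cb) cb->complete(result);
    return true;
  }

  std::size_t num_inflight() const { return inflight_.size(); }
};
```

The caller gets control back as soon as `submit()` returns; the result only becomes visible once the reply path invokes the callback.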
Source
src/osdc/Objecter.(h|cc)
Parameterizing
Once an `ObjectOperation` is initialized, users call its member functions to stuff
requests into it.
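The idea of stuffing requests into one operation object can be sketched as follows. This is a toy model under stated assumptions, not the real `ObjectOperation` interface: `ToyObjectOperation`, `ToySubOp`, and `ToyOpCode` are hypothetical, and plain `std::string` stands in for Ceph's buffer type.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical operation codes for the sketch.
enum class ToyOpCode { WriteFull, Append, Read };

// One sub-request inside an operation.
struct ToySubOp {
  ToyOpCode code;
  std::string indata;  // payload to send to the OSD
  std::string* out;    // where a reply payload should land, or nullptr
};

// Accumulates sub-requests; the whole batch would later be wrapped
// into a single request and submitted.
class ToyObjectOperation {
  std::vector<ToySubOp> ops_;

public:
  void write_full(const std::string& data) {
    ops_.push_back(ToySubOp{ToyOpCode::WriteFull, data, nullptr});
  }
  void append(const std::string& data) {
    ops_.push_back(ToySubOp{ToyOpCode::Append, data, nullptr});
  }
  void read(std::string* out) {
    ops_.push_back(ToySubOp{ToyOpCode::Read, std::string(), out});
  }
  std::size_t size() const { return ops_.size(); }
  const std::vector<ToySubOp>& ops() const { return ops_; }
};
```

Each member function only appends to the internal vector; nothing is sent until the operation is handed over for submission.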
Handling
Figure: OSDC sending request (exemplified by rados_write_full())
- `op_submit()`
  - take `Objecter` lock
  - `_op_submit_with_budget()`
    - throttling?
    - set timeout callback
    - `_op_submit()`
      - calculate target OSDs, and get `OSDSession`
      - `_send_op_account()`
        - inflight OSD requests counter `inflight_ops++`
        - pending completions counter `num_in_flight++`
        - performance counters
      - `_session_op_assign()` assign `OSDSession` to `Op`
      - `_send_op()`
        - convert `Objecter` request `Op` to `Messenger` request `MOSDOp`
        - send `MOSDOp` thru `OSDSession`
      - unlock, release `OSDSession`
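The submit steps above can be sketched in a toy model. Everything here is an assumption made for illustration: `ToySubmitter`, `ToyOp`, and `ToyMessage` are hypothetical, the byte budget fails fast instead of blocking, and the hash-based `calc_target()` is only a placeholder for real placement logic.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <map>
#include <string>
#include <vector>

struct ToyOp {
  uint64_t tid = 0;
  std::string oid;              // object name
  std::string data;             // payload
  bool has_completion = false;  // does the caller wait for a callback?
  int session_osd = -1;         // filled in on submit
};

// Wire-level form of a request in this sketch.
struct ToyMessage {
  uint64_t tid;
  int osd;
  std::string payload;
};

struct ToySubmitter {
  int num_osds = 3;
  long budget_left = 1024;  // bytes allowed in flight
  int inflight_ops = 0;     // in-flight OSD requests
  int num_in_flight = 0;    // pending completions
  uint64_t last_tid = 0;
  std::map<int, std::vector<uint64_t>> session_ops;  // osd -> tids
  std::vector<ToyMessage> outbox;                    // "sent" messages

  // Placeholder placement: hash of the object name modulo OSD count.
  int calc_target(const std::string& oid) const {
    return static_cast<int>(std::hash<std::string>{}(oid) %
                            static_cast<std::size_t>(num_osds));
  }

  bool submit(ToyOp& op) {
    // Throttle: take budget, or refuse (a real client might block here).
    long cost = static_cast<long>(op.data.size());
    if (cost > budget_left) return false;
    budget_left -= cost;

    op.tid = ++last_tid;
    int osd = calc_target(op.oid);  // pick the target and its session

    ++inflight_ops;                          // accounting
    if (op.has_completion) ++num_in_flight;

    op.session_osd = osd;                    // assign session to op
    session_ops[osd].push_back(op.tid);

    // Convert the op into a message and "send" it.
    outbox.push_back(ToyMessage{op.tid, osd, op.data});
    return true;
  }
};
```

The ordering mirrors the list: budget first, then target and session, then counters, then session assignment, and only then the conversion and send.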
Messenger on OSD gets the request, and processes it.
Figure: OSD handling request
Messenger on OSDC listens for incoming responses.
Ending
On successful return
- `Objecter::ms_dispatch()` overriding `Dispatcher`
  - if `CEPH_MSG_OSD_OPREPLY` do `handle_osd_op_reply()`
    - some context validity checking
    - `op->trace.event("osd op reply")` for zipkin trace
    - re-`_op_submit()` if returned retry / redirect / `-EAGAIN`
    - copy return data field `out_(bl/rval/ec)` and call `out_handler` in received `MOSDOpReply` to local `Objecter::Op`
      - `out_bl` pointers in `Objecter::Op` will be forced to point to corresponding received `OSDOp::outdata`
      - `rval` and `ec` will be converted to corresponding host OS error codes from received `OSDOp::rval`
      - `out_handler` will be executed, all calling parameters are provided by the received `OSDOp`
    - `num_in_flight--` if any callback
    - log `l_osdc_op_reply`
    - (get `OSDSession` lock and) `_finish_op()`, do callback
    - release `MOSDOpReply`
  - … (handler for several other types of `Message`s) …
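The reply handling above can be sketched in a toy model: resubmit on `-EAGAIN`, otherwise copy the returned data and result code into the caller's slots, drop the op from the in-flight table, and run the callback. `ToyReplyHandler`, `ToyOp`, and `ToyReply` are hypothetical names, and resubmission is reduced to bumping an attempt counter.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdint>
#include <functional>
#include <map>
#include <string>
#include <utility>

// What comes back from the OSD in this sketch.
struct ToyReply {
  uint64_t tid;
  int result;
  std::string outdata;
};

struct ToyOp {
  std::string* out_bl = nullptr;      // caller's buffer for returned data
  int* out_rval = nullptr;            // caller's slot for the result code
  std::function<void(int)> onfinish;  // completion callback, may be empty
  int attempts = 1;
};

struct ToyReplyHandler {
  std::map<uint64_t, ToyOp> inflight;
  int num_in_flight = 0;  // pending completions

  // Returns true if the op was completed, false if the reply was
  // stale or the op has to be sent again.
  bool handle_reply(const ToyReply& m) {
    auto it = inflight.find(m.tid);
    if (it == inflight.end()) return false;  // stale or duplicate reply
    ToyOp& op = it->second;

    if (m.result == -EAGAIN) {  // the OSD asked for a retry
      ++op.attempts;            // stands in for resubmitting the op
      return false;
    }

    // Copy the returned data and result into the caller's slots.
    if (op.out_bl) *op.out_bl = m.outdata;
    if (op.out_rval) *op.out_rval = m.result;

    // Take the callback out before the op is erased.
    std::function<void(int)> cb = std::move(op.onfinish);
    if (cb) --num_in_flight;
    inflight.erase(it);  // finish the op
    if (cb) cb(m.result);
    return true;
  }
};
```

Moving the callback out before erasing the op keeps it alive after the table entry is gone, so it can run as the last step.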
On timeout
- `op_cancel(tid, -ETIMEDOUT)`
  - `num_in_flight--` and execute associated callback with `-ETIMEDOUT`
  - `_op_cancel_map_check()`
    - erase from `check_latest_map_ops`. Ops that were waiting for the latest OSD map were pushed into this map during `_op_submit()` with `_send_op_map_check()`, if `_calc_target()` determines `check_for_latest_map` is true.
  - `_finish_op()`
    - `put_op_budget_bytes()` accumulate budget to throttler
    - erase from timeout pool
    - `_session_op_remove()`
      - release `OSDSession` ref cntr
      - `inflight_ops--` (and `l_osdc_op_active` in logger)
    - release `Op` ref cntr
The timeout case could be a compact implementation of properly ending and releasing an
`Op`. We may end up just using `op_cancel(tid, 0)` instead of reimplementing everything in `handle_osd_op_reply()`.
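The cancel path can be sketched in a toy model to show why it works as a general "end this op" routine: the result code is just a parameter, so the same cleanup runs for a timeout or for a success code of 0. `ToyCanceller` and its members are hypothetical names, not Ceph's actual data structures.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdint>
#include <functional>
#include <map>
#include <set>
#include <utility>

struct ToyOp {
  long budget = 0;                    // bytes taken from the throttler
  std::function<void(int)> onfinish;  // completion callback, may be empty
};

struct ToyCanceller {
  std::map<uint64_t, ToyOp> ops;
  std::set<uint64_t> check_latest_map_ops;  // ops waiting for a newer map
  std::set<uint64_t> timeouts;              // armed timeout events
  long budget_left = 0;
  int inflight_ops = 0;
  int num_in_flight = 0;

  // Common cleanup: return budget, disarm timeout, drop the op.
  void finish_op(uint64_t tid) {
    budget_left += ops.at(tid).budget;
    timeouts.erase(tid);
    --inflight_ops;
    ops.erase(tid);
  }

  // End an op with result code r. Returns -ENOENT for unknown tids.
  int op_cancel(uint64_t tid, int r) {
    auto it = ops.find(tid);
    if (it == ops.end()) return -ENOENT;
    std::function<void(int)> cb = std::move(it->second.onfinish);
    if (cb) {
      --num_in_flight;
      cb(r);  // the caller sees r, e.g. -ETIMEDOUT
    }
    check_latest_map_ops.erase(tid);
    finish_op(tid);
    return 0;
  }
};
```

In this sketch, calling `op_cancel(tid, 0)` would run exactly the same cleanup and hand the callback a success code.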

