ASTERISK-20175

Security Vulnerability: denial of service attack through exploitation of device state caching

    Details

    • Regression: No

      Description

      From Russell:

      I have been working with someone on some performance issues with their
      Asterisk cluster that uses distributed device state. One of the
      problems that we identified was that the size of the device state
      cache was growing out of control. To view the cache, you can do:

          *CLI> event dump cache DeviceState
      

      In particular, the states that were causing the problem on these
      systems were things like:

          Local/12341234@whatever
          DAHDI/i8/12341234
      

      Certain "device states" like this are useless to cache. Imagine an
      outbound call center that uses Local channels in their dialplan and
      PRIs for doing outbound calls. They get entries in the cache for
      every number they dial. Ouch. That's a bug that needs to be
      addressed and I'm not quite sure how to fix it in a good generic way
      yet. However, that's not the vulnerability. It's just the background
      that led me to the vulnerability.

      I started thinking about how far this problem really reaches. I
      wondered, can I remotely grow the cache, causing performance problems
      and eventually running out of RAM? Unfortunately, yes. I have
      verified this with SIP. I imagine the same issue exists with IAX2.

      In chan_sip, if you allow anonymous calls, the channel name is based
      on the domain in the From header. I verified this vulnerability by
      doing the following:

          ; sip.conf
          [someserver]
          type=peer
          host=someserver.com
          fromdomain=example.com
          fromuser=foo
      

      I then used a call file:

          Channel: SIP/foo@someserver
          CallerID: "My Name" <1111111>
          Application: Playback
          Data: beep
      

      The domain in the From header should be "example.com". The channel
      name on the remote server should be "SIP/example.com-<something>". An
      entry will be added to the cache for "SIP/example.com". This means
      that I can very easily continue to send calls with different domains
      and fill up the cache.
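
The growth pattern described above can be illustrated with a toy model. This is a minimal sketch in Python, not Asterisk code: `DeviceStateCache`, `device_from_channel`, and `simulate_anonymous_calls` are hypothetical names, the state strings are placeholders, and the assumption that a channel name always ends in a `-<something>` suffix comes only from the example in the report.

```python
# Toy model of a device state cache keyed by device name.
# Hypothetical names throughout; this is not Asterisk's implementation.

def device_from_channel(channel_name: str) -> str:
    """Drop the per-call suffix, e.g. 'SIP/example.com-0000001a' -> 'SIP/example.com'.

    Assumes the channel name always carries a '-<something>' suffix.
    """
    return channel_name.rsplit("-", 1)[0]


class DeviceStateCache:
    """Keeps the last known state per device and never evicts anything."""

    def __init__(self):
        self._states = {}

    def update(self, device: str, state: str) -> None:
        self._states[device] = state

    def __len__(self) -> int:
        return len(self._states)


def simulate_anonymous_calls(cache: DeviceStateCache, domains) -> None:
    """Each call arrives with a different From domain, so each creates a new device."""
    for i, domain in enumerate(domains):
        channel = f"SIP/{domain}-{i:08x}"
        device = device_from_channel(channel)
        cache.update(device, "INUSE")      # call starts
        cache.update(device, "NOT_INUSE")  # call ends, but the entry stays


cache = DeviceStateCache()
simulate_anonymous_calls(cache, [f"attacker{i}.example" for i in range(1000)])
print(len(cache))  # one entry per distinct domain
```

In this model the cache size tracks the number of distinct domains the caller chooses, not the number of devices configured on the server, which is the core of the problem.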

      The public server that I tested this against happened to be running
      Asterisk 10. I believe that this affects all versions that have the
      device state cache, which would be 1.6.something and up.

      This is a nasty problem and I'm not sure what the fix should be. It's
      an architectural problem. The cache needs to only consist of things
      that are defined locally, and not things that are dynamically
      generated, but there's not a good generic way to determine that given
      a "device" name. I'd be happy to brainstorm with others on this.

      While the original report came from me, I'd like to credit Leif Madsen
      and Joshua Colp for their assistance with verifying the vulnerability.

      Thanks,


      Russell Bryant

          Activity

          Kinsey Moore added a comment -

          Consumers for further state distribution:
          res_xmpp
          res_jabber
          res_corosync

          Direct consumers of device state:
          app_queue
          CCSS
          pbx hints
          devicestate

          All of these seem to hook device state unconditionally, except CCSS, which hooks information only for specific devices as they require CCSS.

          Kinsey Moore added a comment -

          It also appears that main/devicestate.c is the only consumer of cached device state events.

          Kinsey Moore added a comment -

          I finally got a chance to talk with Russell, and this is going to be pretty nasty to fix. Only the channel driver that creates the channel can know whether its state should be cached. This information has to live with the channel (probably as a flag) as its state changes, so that the events it creates can be marked as cacheable. Such a flag already exists, but on ast_event_ref rather than on the event itself (see _ast_event_queue in main/event.c). It is not exposed in an easily usable way, so using it will require a small API change (or it may be easier to put it on the event itself via a new IE). Device state changes that are distributed must also carry this cacheable flag so that remote systems know whether the new state should be cached or discarded after any receivers have taken appropriate action. I have not yet determined whether this can be backwards compatible with the existing event state distribution architecture.
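
A rough sketch of the idea in this comment, written in Python for brevity rather than in Asterisk's C. Everything here is hypothetical (`Channel`, `DeviceStateEvent`, `EventCore`, `raise_state_change`); it only illustrates a flag that is set by the channel driver, copied onto each event, and consulted after subscribers have been notified.

```python
from dataclasses import dataclass


@dataclass
class Channel:
    device: str
    cacheable: bool  # decided once by the channel driver that created the channel


@dataclass
class DeviceStateEvent:
    device: str
    state: str
    cacheable: bool  # travels with the event, including to remote systems


def raise_state_change(channel: Channel, state: str) -> DeviceStateEvent:
    """Copy the channel's flag onto every event it generates."""
    return DeviceStateEvent(channel.device, state, channel.cacheable)


class EventCore:
    def __init__(self):
        self.cache = {}      # device -> last state, only for cacheable events
        self.delivered = []  # stands in for subscriber callbacks

    def queue(self, event: DeviceStateEvent) -> None:
        self.delivered.append(event)  # receivers always see the event
        if event.cacheable:
            self.cache[event.device] = event.state
        # otherwise the event is discarded once receivers have acted on it


core = EventCore()
core.queue(raise_state_change(Channel("SIP/alice", True), "INUSE"))
core.queue(raise_state_change(Channel("Local/12341234@whatever", False), "INUSE"))
print(sorted(core.cache))
```

Both events reach subscribers, but only the locally defined device ends up in the cache.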

          Matt Jordan added a comment -

          So, first you should probably keep in mind that whatever is done has to be done in the context of 1.8+. _ast_event_queue doesn't exist in 1.8, and the ast_event_ref object does not have a cache attribute in 1.8.

          That being said, that doesn't mean it can't be backported to 1.8.

          Something Kevin suggested was to think about making this configurable in each channel driver. The default would be to 'save state' for each device, but then allow for 'guest' devices to not have their state saved, as well as any particular configurable device. In the case of local channels, you'd probably never have their device state cached.

          Ideally then, each device would mark whether or not it wants its event to be cached when it raises the event. This would give a system administrator the ability to configure the system to prevent the situation Russell ran into, while keeping the current behavior (cache stuff) if they so desire.

          As far as the distributed architecture goes, if we have to convey the cache information in the event, then it won't be 'purely' backwards compatible. However, if all we've done is embed a new IE into the event, then 'old' systems should be okay, since they can pull information out of the event based on the identifier of each IE.
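
One way to picture the per-driver configuration suggested in this comment is the following Python sketch. The option names (`never_cache`, `cache_guests`, `per_device`) and the `should_cache` helper are invented for illustration; no such options are claimed to exist in any channel driver.

```python
from dataclasses import dataclass, field


@dataclass
class DriverConfig:
    """Hypothetical per-channel-driver caching policy."""
    never_cache: bool = False   # e.g. a driver for Local channels
    cache_guests: bool = True   # default keeps the current behavior
    per_device: dict = field(default_factory=dict)  # device name -> bool override


def should_cache(cfg: DriverConfig, device: str, is_guest: bool) -> bool:
    """Decide, at the time the event is raised, whether its state is cached."""
    if cfg.never_cache:
        return False
    if is_guest:
        return cfg.cache_guests
    return cfg.per_device.get(device, True)  # 'save state' unless overridden


sip = DriverConfig(cache_guests=False, per_device={"SIP/kiosk": False})
local = DriverConfig(never_cache=True)

print(should_cache(sip, "SIP/alice", is_guest=False))
print(should_cache(sip, "SIP/example.com", is_guest=True))
print(should_cache(local, "Local/12341234@whatever", is_guest=False))
```

With the defaults left untouched every device is cached as before; an administrator would opt out for guests or for individual devices.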

          Kinsey Moore added a comment - - edited

          List of items to complete:

          • Identify all event generation that would need to determine cachability of the event being generated and determine in what cases these events are cachable. (needs a lot of research)
          • Per-channel-driver implementation of options to change caching behavior with cachability flag on channel.
            • Maybe this would be better as a global option to make all events cachable vs some not cachable? (would still require per-channel flag)
          • Change internal generation/usage of cache flag to be an IE on the ast_event instead of a flag on the ast_event_ref.
          • Update all instances of event generation to use this IE appropriately in conjunction with the flag on the channel or the global option, whichever is chosen.
            • Events without the IE should be considered cachable since they would be coming from a legacy system that expects them to be cachable. Otherwise, all events should have the cachability IE.
          • Change distributed generation/usage of events to serialize/deserialize the IE describing cachability.
            • res_ais/res_corosync: Transparent since the event is sent as binary data. An unknown field in the event should not cause problems to legacy Asterisk systems running res_ais/res_corosync.
            • res_xmpp/res_jabber: The persist_items configuration field is a per-subscription configuration and not a per-event configuration, so it cannot carry this. The cachability information will be inserted as an item in aji_build_publish_skeleton.
              • The cachability item should always come last so as not to disturb the parsing of legacy implementations. This should assume cachability by default and the additional information should negate cachability to prevent incorrect interpretation of events from legacy systems.

          Note 1: There do not appear to be any event comparison functions in event.h that would choke on an additional/unknown IE.

          Note 2: Porting from 1.8 forward should not be much of an issue, but the existing cachability flags should be removed from the ast_event_ref in 10, 11, and trunk and the differences between 1.8 and 10 should be smoothed out as far as ast_event_queue vs ast_event_queue_and_cache.

          Does this need to be done for core, core+extended, or all modules?

          Edit: clarification on configuration option as per-subscription and not per-event as published.
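
The legacy-compatibility rule in the list above (an event without the IE is treated as cachable, and the new IE only ever negates cachability) can be sketched with a toy type-length-value encoding. This Python sketch is illustrative only: the IE numbers, the wire layout, and the function names are assumptions, not Asterisk's actual event format.

```python
import struct

# Hypothetical IE identifiers for this sketch.
IE_DEVICE = 1
IE_STATE = 2
IE_CACHABLE = 3  # the new IE, appended last


def encode(ies) -> bytes:
    """Serialize (ie_type, payload) pairs as type/length/payload records."""
    out = b""
    for ie_type, payload in ies:
        out += struct.pack("!HH", ie_type, len(payload)) + payload
    return out


def decode(data: bytes) -> dict:
    """Parse every record into a dict keyed by IE identifier."""
    ies = {}
    offset = 0
    while offset < len(data):
        ie_type, length = struct.unpack_from("!HH", data, offset)
        offset += 4
        ies[ie_type] = data[offset:offset + length]
        offset += length
    return ies


def legacy_read(data: bytes):
    """A legacy reader looks up only the IEs it knows and ignores the rest."""
    ies = decode(data)
    return ies[IE_DEVICE].decode(), ies[IE_STATE].decode()


def is_cachable(data: bytes) -> bool:
    """A missing IE means the sender is a legacy system, so assume cachable."""
    raw = decode(data).get(IE_CACHABLE)
    return True if raw is None else raw != b"\x00"


new_event = encode([(IE_DEVICE, b"SIP/example.com"), (IE_STATE, b"INUSE"),
                    (IE_CACHABLE, b"\x00")])
old_event = encode([(IE_DEVICE, b"SIP/alice"), (IE_STATE, b"INUSE")])

print(legacy_read(new_event), is_cachable(new_event), is_cachable(old_event))
```

Because readers select IEs by identifier, the extra record does not disturb a legacy reader, and an updated reader still caches events from systems that never send it.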

          Kinsey Moore added a comment - - edited

          Event-related areas needing evaluation:

          • Code that uses ast_event_queue_and_cache with AST_EVENT_DEVICE_STATE[_CHANGE]
            • res_jabber/res_xmpp
            • res_corosync/res_ais
            • devicestate.c
          • Code that uses ast_event_subscribe[_new] with any cachable event types
            • res_jabber/res_xmpp
            • res_corosync/res_ais
            • devicestate.c
          • Code that consumes cached AST_EVENT_DEVICE_STATE[_CHANGE] events
            • devicestate.c
          • Code that uses ast_devstate_changed or ast_devstate_changed_literal
            • channels: dahdi, sip, agent, iax2, skinny
            • main: channel, devicestate (needs to pass through the new cache parameter), features
            • apps: confbridge, meetme
            • res: calendar
            • funcs: devstate