How does the Cocoon iOS app work?

Sept 3, 2020 (App Version: 2.0.2)

This is an infrastructure review of the Cocoon iOS app. This post is an attempt to explain what happens behind the scenes when you use an app like Cocoon.

The app is used for group text and boasts a lot of group chat features, including text, image/video, flight status, audio, and video calls. We have been trying to use Cocoon for our family chat instead of WhatsApp for quite some time now.

Download the iOS app from the App Store here. The product website is here.

Overall, the app's backend is built entirely on top of Google Cloud. They use Segment for logging, Sentry for crash analysis, FlightAware for pulling flight details, the Google Maps SDK for maps and location-related features, and Twilio for video chat.

Signup Flow

Once you download and open the app, you see this home screen.

Cocoon Startup Screen

As soon as the app is launched, you see a bunch of GET requests that initialize all the libraries that are used.

Cocoon uses

When the app is launched for the first time, it calls Firebase logging endpoints, among others, which collect detailed OS-level information and send it as POST requests to Google hostnames. This includes the phone type (e.g. iPhone8), the app name (com.glasswing.cocoon), Firebase SDK version information, and so on.

The next call initializes the Segment iOS SDK and returns all the integrations that the Cocoon app uses. Below is a copy of the JSON returned by the Segment API.

GET response from

A settings call like this is usually used to get/set real-time configuration for the app, including the API keys and various flags, as you can see in the response above.
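To make the role of such a settings payload concrete, here is a minimal sketch of how a client could list the integrations it has been told to enable. The top-level `integrations` key follows the general shape of Segment's settings payload as far as I know; the sample values and the helper name `enabled_integrations` are made up for illustration and are not copied from Cocoon's response.

```python
import json

# Hypothetical, heavily trimmed settings payload. The keys and values are
# illustrative and NOT taken from Cocoon's actual response.
SAMPLE_SETTINGS = """
{
  "integrations": {
    "Segment.io": {"apiKey": "REDACTED"},
    "Firebase": {}
  },
  "plan": {"track": {}}
}
"""


def enabled_integrations(settings_json: str) -> list[str]:
    """Return the integration names listed in a settings payload, sorted."""
    settings = json.loads(settings_json)
    # A missing "integrations" key is treated as "nothing enabled".
    return sorted(settings.get("integrations", {}))


print(enabled_integrations(SAMPLE_SETTINGS))
```

Because the list comes from the server, an integration could be switched on or off without shipping a new build of the app.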

Cocoon makes three other important calls to its own backend. The first one is the Cocoon app config call, which gets a whole bunch of configs that are used by the Cocoon app itself.

These settings include an extensive list of SDK versions that the app needs to use, mainly for the Google Firebase SDKs. Some app-level settings are returned as well. I have included a snippet from the response below.

GET response from

Next is the call to get the 'moods', which returns the list of moods that Cocoon offers in its in-app messaging. Below is a snippet of the response for that API call.

GET response from

Below is a screenshot of what the in-app display for these moods looks like within Cocoon. Looks like these moods can be added/removed in near-realtime by changing the response from the API.
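The practical upshot of serving moods from an API is that the app only renders whatever list it receives. Here is a minimal sketch of that pattern, assuming a hypothetical response shape (`moods`, `id`, `label`, `enabled`) that is not taken from Cocoon's real payload:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mood:
    id: str
    label: str


# Bundled fallback used when the backend response is unusable (hypothetical).
DEFAULT_MOODS = [Mood("happy", "Happy")]


def moods_to_display(response: dict) -> list[Mood]:
    """Build the mood picker contents from a (hypothetical) API response."""
    items = response.get("moods")
    if not isinstance(items, list):
        return DEFAULT_MOODS
    # Only moods the backend leaves enabled are shown, so removing or
    # disabling an entry server-side changes the UI without an app update.
    return [
        Mood(item["id"], item["label"])
        for item in items
        if item.get("enabled", True)
    ]


response = {
    "moods": [
        {"id": "happy", "label": "Happy"},
        {"id": "tired", "label": "Tired", "enabled": False},
    ]
}
print([m.label for m in moods_to_display(response)])
```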

Cocoon Mood Screen

Another GET call returns all the color shades that are offered when you create a user after you finish signup.

GET response from

This is the screen that uses this color API in the Cocoon app. Whatever color the backend returns is used here.

Cocoon Color Picker screen

Since Cocoon uses Firebase, it authenticates itself with the appid, fid, etc., and the endpoint /v1/projects/cocoon-beta/installations/ responds with a session token and its expiration. This is necessary for session management from the Firebase perspective.
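A token with an expiration means the client has to track when to ask for a new one. The sketch below shows one way to do that, assuming the lifetime arrives as a string of seconds such as `"604800s"`. That format, and the helper names `token_expiry` and `needs_refresh`, are assumptions for illustration rather than something read from Cocoon's traffic.

```python
from datetime import datetime, timedelta, timezone


def token_expiry(expires_in: str, issued_at: datetime) -> datetime:
    """Convert a duration string such as "604800s" into an absolute expiry."""
    if not expires_in.endswith("s"):
        raise ValueError(f"unexpected duration format: {expires_in!r}")
    seconds = int(expires_in[:-1])
    return issued_at + timedelta(seconds=seconds)


def needs_refresh(expiry: datetime, now: datetime, margin_s: int = 60) -> bool:
    """Refresh slightly early so a request never carries an expired token."""
    return now >= expiry - timedelta(seconds=margin_s)


issued = datetime(2020, 9, 3, tzinfo=timezone.utc)
expiry = token_expiry("604800s", issued)
print(expiry.isoformat())
```

The safety margin is a design choice: refreshing a minute early costs almost nothing and avoids a request failing because the token expired in flight.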

Every user interaction, such as taps, clicks, and swipes, is logged extensively to Segment. An example POST request to the Segment endpoint looks like the one below, with all private information redacted.

POST request to

It looks like most of the logs have the same format with similar metadata; only the `event` and `type` parameters change. When I finished the onboarding flow, the event name was "Completed Onboarding Flow", and so on.
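The "same envelope, different `event` and `type`" pattern can be sketched as follows. The field names follow Segment's common `track` call fields as I understand them; the property values are placeholders, and `build_track_event` is a hypothetical helper, not part of any SDK.

```python
import uuid
from datetime import datetime, timezone


def build_track_event(event: str, anonymous_id: str, properties: dict) -> dict:
    """Assemble a Segment-style "track" payload for a single app event."""
    return {
        "type": "track",   # other call types include "screen" and "identify"
        "event": event,    # e.g. "Completed Onboarding Flow"
        "anonymousId": anonymous_id,
        "messageId": str(uuid.uuid4()),
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "properties": properties,
        # Metadata shared by every event; the values are placeholders.
        "context": {
            "app": {"namespace": "com.glasswing.cocoon", "version": "2.0.2"}
        },
    }


first = build_track_event("Completed Onboarding Flow", "anon-123", {})
second = build_track_event("Opened Cocoon", "anon-123", {"source": "push"})
print(first["event"], "|", second["event"])
```

Only `event`, `properties`, and the generated identifiers differ between the two payloads, which matches what the captured logs look like. The second event name is invented for the example.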

In addition to Cocoon sending the app events to Segment, Google does its own collection of data, which is encrypted and heavily compressed (so I am not able to see the actual content). Initially, there is a GET request with the app name (com.glasswing.cocoon), phone type, and a few other pieces of metadata. Then there is a POST call to the same hostname but a different endpoint with an app_instance_id, which returns a bunch of configs.

As you keep using the Cocoon app, you see further calls made with encrypted data. Google actually publishes a list of the fields that it collects, which is here. A basic Google search yielded these and these. It does look like Google collects more information about app usage and other metadata.

As you go through the registration process, Cocoon requires you to verify your phone number (similar to WhatsApp and others). It also collects your email, which it does not verify and which is stored in Mailchimp. Cocoon uses the Firebase Identity Toolkit to send an SMS to new users to verify their identity.

Once the user authentication is complete, Cocoon asks the user for permission to send notifications, access location, read health data to track steps, etc. A unique userId is created as soon as the registration is complete. The location is tracked to the exact latitude and longitude, and it looks like the Google Firebase SDK quickly converts them to a street address. Going forward, the location context is carried in all the calls between the app and the Cocoon API. I am not seeing the location info sent to the logs in Segment.


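The app has a bunch of features similar to what other group messaging apps have. You can create a Cocoon, send text messages, share photos and videos, share your location and check in, share flight status, and join audio/video rooms.

The sections below go through each of these features.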

Create Cocoon

To create a Cocoon, which is a group, you click Get Started, give the Cocoon a name, select a nickname (which is how you show up to others), and then click Create. A POST request is made to the backend with the name, nickname, and color/picture payload. You can see a sample response from the endpoint below, including the Cocoon ID that is created at this point.

POST response from

Text Messages

Every time a user posts a message in a Cocoon, a POST call is made to the Cocoon backend. Below is the actual payload that is sent to that endpoint.

POST request to

For every text message, a unique messageId is created. The app tells the backend to associate the messageId with a cocoonId and a userId. This records which user sent which message.
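The association between a message, its group, and its sender can be sketched like this. The key names mirror the identifiers mentioned above, but the exact payload layout, the `text` field, and the use of a client-generated UUID are assumptions for illustration.

```python
import uuid
from collections import defaultdict


def new_message(cocoon_id: str, user_id: str, text: str) -> dict:
    """Build a message record tied to one group and one sender."""
    return {
        "messageId": str(uuid.uuid4()),  # unique per message (assumed UUID)
        "cocoonId": cocoon_id,
        "userId": user_id,
        "text": text,
    }


def messages_by_user(messages: list[dict]) -> dict[str, list[str]]:
    """Index message ids by sender, which the userId link makes possible."""
    index: dict[str, list[str]] = defaultdict(list)
    for message in messages:
        index[message["userId"]].append(message["messageId"])
    return dict(index)


history = [
    new_message("cocoon-1", "user-a", "Landed!"),
    new_message("cocoon-1", "user-b", "Welcome back"),
    new_message("cocoon-1", "user-a", "Heading home"),
]
print({user: len(ids) for user, ids in messages_by_user(history).items()})
```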

Share Photo/Video

Like any other group messaging app, Cocoon allows users to share photos and videos with their groups. It uses Google Storage to store these files. It stores a thumbnail as well as the full image/video file that is uploaded.

This upload is split into two calls to Cocoon's backend. The first call is a PUT request with a unique file name. Once the file is successfully uploaded, the app does a POST to associate the uploaded file with the user and the Cocoon. Below is an example of the request payload that Cocoon sends to the backend.
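The ordering matters here: the file should only be announced to the group after the upload succeeded. Below is a minimal sketch of that two-step flow. The transport functions are injected so the logic can run without a network, and their signatures, the body keys, and the name `share_media` are all hypothetical rather than Cocoon's actual API.

```python
import uuid
from typing import Callable

# Hypothetical transport signatures, not Cocoon's API.
PutFn = Callable[[str, bytes], int]   # (object_name, data) -> HTTP status
PostFn = Callable[[dict], int]        # (json_body) -> HTTP status


def share_media(data: bytes, ext: str, cocoon_id: str, user_id: str,
                put: PutFn, post: PostFn) -> str:
    """Upload the bytes first, then register the upload with the group."""
    object_name = f"{uuid.uuid4()}.{ext}"  # unique file name for the PUT
    status = put(object_name, data)
    if not 200 <= status < 300:
        # Never announce a file that was not stored.
        raise RuntimeError(f"upload failed with status {status}")
    status = post({"file": object_name, "cocoonId": cocoon_id, "userId": user_id})
    if not 200 <= status < 300:
        raise RuntimeError(f"association failed with status {status}")
    return object_name


calls = []
name = share_media(
    b"\x89PNG...", "png", "cocoon-1", "user-1",
    put=lambda n, d: calls.append(("PUT", n)) or 200,
    post=lambda body: calls.append(("POST", body["file"])) or 200,
)
print([method for method, _ in calls])
```

Splitting the flow this way keeps large binary uploads off the messaging endpoint, which only ever receives a small JSON reference to the stored file.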

POST request to


Cocoon has a neat way to share location between users within a group if you wish, which also enables check-ins. If you use the check-in feature and search for locations, Cocoon uses the Google Maps SDK to search and present you with relevant results. After you select a place to check in, a POST request is made to the endpoint. Below is a sample request JSON that is sent to the backend.

POST request to

Along with the check-in feature, if you tap on other people within your Cocoon, you can see their location if they have enabled location sharing. This feature uses the same endpoint.

Below is the actual PUT request that is being sent to Cocoon's Backend.

PUT request to

Flight Status

Cocoon has this neat feature that we personally use all the time when our family is traveling. A user can add flight information to the group, and flight updates are then sent to the group. Cocoon uses the FlightAware API to pull flight information and display it in the app. Once the user selects a flight, Cocoon posts the flight information to its backend.

POST request to

Audio/Video rooms

Nowadays almost all group chat apps are expected to have audio and video capabilities. These are usually hard to implement natively because of the complexity involved. With Cocoon having these capabilities, I was intrigued to dig in and see what their implementation looks like.

Cocoon doesn't have a native implementation of the video features. They use the Twilio Video API in their implementation.

Twilio has a concept called rooms that can be controlled from the backend, and rooms can be created on the fly. In this case, if a group wants to meet, the Cocoon backend creates a room for that group, and when a user clicks join, it sends an access token to the client. The client then uses that access token with the Twilio SDK to join the room. These calls are made over UDP using the WebRTC protocol.

As soon as you click `invite to talk` in your Cocoon, the app calls the backend with the cocoonId, userId, and so on. The backend responds with an access token, a roomId, and the user ID. Below is a sample response from the Cocoon backend.

GET response from

The Twilio SDK uses the accessToken and the roomId to connect the user to the group, and the call starts. The roomId is unique, just like the cocoonId. When a second user wants to connect to the same call, they get a new unique accessToken with the same roomId. Twilio is hosted on AWS, so when you use the video call feature you are actually connecting to AWS.
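The "one room per group, one token per participant" behavior can be modeled with a toy backend. This is only a sketch of the bookkeeping: `RoomBroker` and its response keys are hypothetical, and the random string stands in for a real Twilio access token, which is a signed JWT minted by the backend.

```python
import secrets


class RoomBroker:
    """Toy stand-in for a backend that hands out video room credentials."""

    def __init__(self) -> None:
        self._rooms: dict[str, str] = {}  # cocoonId -> roomId

    def join(self, cocoon_id: str, user_id: str) -> dict:
        # The first caller creates the room; later callers reuse it.
        room_id = self._rooms.get(cocoon_id)
        if room_id is None:
            room_id = f"room-{secrets.token_hex(8)}"
            self._rooms[cocoon_id] = room_id
        return {
            "roomId": room_id,
            "userId": user_id,
            # Placeholder only: a real token would be a signed JWT.
            "accessToken": secrets.token_urlsafe(32),
        }


broker = RoomBroker()
first = broker.join("cocoon-1", "user-a")
second = broker.join("cocoon-1", "user-b")
print(first["roomId"] == second["roomId"])
print(first["accessToken"] == second["accessToken"])
```

Issuing a fresh token per participant means access can be scoped to one identity and one room, and a leaked token does not expose anyone else's credentials.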

A packet capture shows a DNS resolution for a Twilio endpoint when the app tries to establish the video room. More information on the Twilio Video IP ranges is here.

Packet Capture

You can see the UDP connection, which goes to Twilio in the US West Coast (Oregon) AWS region.
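Mapping a captured IP address to a region comes down to checking it against published CIDR blocks. Here is a minimal sketch using Python's `ipaddress` module. The region names and CIDR blocks are placeholders taken from the reserved documentation ranges, not Twilio's or AWS's real ranges, and `region_for` is a hypothetical helper.

```python
import ipaddress
from typing import Optional

# Placeholder CIDR blocks from the reserved documentation ranges; substitute
# the ranges published for the region you are checking.
REGION_RANGES = {
    "example-region-a": ["192.0.2.0/24"],
    "example-region-b": ["198.51.100.0/24", "203.0.113.0/25"],
}


def region_for(ip: str) -> Optional[str]:
    """Return the first region whose CIDR blocks contain the address."""
    addr = ipaddress.ip_address(ip)
    for region, cidrs in REGION_RANGES.items():
        if any(addr in ipaddress.ip_network(cidr) for cidr in cidrs):
            return region
    return None


for captured in ["192.0.2.17", "203.0.113.10", "203.0.113.200"]:
    print(captured, "->", region_for(captured))
```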

Dependency Diagram

Overall Impression