August 25, 2024

Connecting to MongoDB with Elixir

In this article, I would like to explain how to connect to MongoDB from Elixir. MongoDB is an excellent NoSQL database with many interesting possibilities. Unfortunately, it is somewhat underestimated in the Elixir community, and most tutorials refer to PostgreSQL instead.

We will learn

  • how to create a replica set and write a simple Elixir program that connects to the three instances.
  • how to connect to Atlas with a user certificate
  • how to enable the logger
  • how to listen to driver events

Let’s install the Community Edition of MongoDB. On macOS we can use brew to install it:

    brew tap mongodb/brew
    brew install mongodb-community

Before we set up MongoDB as a replica set, we create a new Elixir project with mix. With the --sup option, we get the basic skeleton of an OTP application with a supervision tree.

➜  elixir mix new connect --sup
* creating README.md
* creating .formatter.exs
* creating .gitignore
* creating mix.exs
* creating lib
* creating lib/connect.ex
* creating lib/connect/application.ex
* creating test
* creating test/test_helper.exs
* creating test/connect_test.exs

Your Mix project was created successfully.
You can use "mix" to compile it, test it, and more:

    cd connect
    mix test

Run "mix help" for more commands.

We add the MongoDB driver as a dependency to the mix.exs file and run mix deps.get to fetch it.

defp deps do
  [
    {:mongodb_driver, "~> 1.4"}
  ]
end

MongoDB - Replica Set

Now we will set up MongoDB as a replica set. We could start a standalone instance, but a typical environment consists of a replica set of three instances. A replica set also enables transactions.

We first start three instances on three different ports and then configure the replica set. The replica set is called vesoldo_set. On macOS we also raise the limit on open file descriptors so that the instances do not run out of them.

mkdir -p tmp/db1
mkdir -p tmp/db2
mkdir -p tmp/db3
ulimit -S -n 2048

mongod --wiredTigerCacheSizeGB=1 --fork --dbpath tmp/db1 --logpath tmp/db1/log --port 27017 --bind_ip 0.0.0.0 --replSet vesoldo_set
mongod --wiredTigerCacheSizeGB=1 --fork --dbpath tmp/db2 --logpath tmp/db2/log --port 27018 --bind_ip 0.0.0.0 --replSet vesoldo_set
mongod --wiredTigerCacheSizeGB=1 --fork --dbpath tmp/db3 --logpath tmp/db3/log --port 27019 --bind_ip 0.0.0.0 --replSet vesoldo_set

If the servers are running without any error then we can configure the replica set. We use the mongosh tool to log in and configure the replica set:

mongosh 'mongodb://127.0.0.1:27017'

In the mongo shell we start the replica set using the rs.initiate command:

rs.initiate({_id: "vesoldo_set", members: [{_id: 0, host: "127.0.0.1:27017"}, {_id: 1, host: "127.0.0.1:27018"}, {_id: 2, host: "127.0.0.1:27019"}]})

If the shell shows something like:

{
  ok: 1,
  '$clusterTime': {
    clusterTime: Timestamp({ t: 1724679121, i: 1 }),
    signature: {
      hash: Binary.createFromBase64('AAAAAAAAAAAAAAAAAAAAAAAAAAA=', 0),
      keyId: Long('0')
    }
  },
  operationTime: Timestamp({ t: 1724679121, i: 1 })
}

then we can quit the shell and our replica set is running. For more information about replica sets, take a look at the official documentation.

Connecting to the local instance

Now everything is prepared to connect to our replica set from Elixir. You can use the interactive Elixir shell and start the so-called topology process:

iex -S mix

Compiling 2 files (.ex)
Generated connect app
Interactive Elixir (1.17.2) - press Ctrl+C to exit (type h() ENTER for help)
iex [15:39 :: 1] >  {:ok, top} = Mongo.start_link(url: "mongodb://localhost:27017/vesoldo")
{:ok, #PID<0.1111.0>}
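The URL above names only one of the three members. A variant is to list all members as seeds, so that the initial connection does not depend on the first host being reachable. The following is a minimal sketch, assuming the driver's `:seeds` and `:database` options behave as described in its README; the module name `Connect.ReplicaSet` is hypothetical and not part of the driver.

```elixir
defmodule Connect.ReplicaSet do
  # Hypothetical helper module; the name is not part of the driver.

  # The three members started above, addressed exactly as in rs.initiate.
  @seeds ["127.0.0.1:27017", "127.0.0.1:27018", "127.0.0.1:27019"]

  # Options for Mongo.start_link/1: a seed list instead of a single-host URL.
  def opts do
    [database: "vesoldo", seeds: @seeds]
  end

  # Starts the topology process; this needs the running replica set.
  def connect do
    Mongo.start_link(opts())
  end
end
```

In iex, `{:ok, top} = Connect.ReplicaSet.connect()` would then take the place of the single-host call.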

Now we have the pid of the topology process. This process manages all the details of the replica set. Let’s insert some data and fetch it from the server.

Mongo.insert_one(top, "dogs", %{name: "Greta"})
{:ok, %Mongo.InsertOneResult{acknowledged: true, inserted_id: #BSON.ObjectId<66cc86e24ae06e2a2c893b4b>}}
Mongo.insert_one(top, "dogs", %{name: "Tom"})
{:ok, %Mongo.InsertOneResult{acknowledged: true, inserted_id: #BSON.ObjectId<66cc86f04ae06e2a2c2d4b58>}}
Mongo.insert_one(top, "dogs", %{name: "Gustav"})
{:ok, %Mongo.InsertOneResult{acknowledged: true, inserted_id: #BSON.ObjectId<66cc86ff4ae06e2a2c579be9>}}

That was pretty easy. Now we try to fetch the data by using Mongo.find which returns a stream, so we pipe the stream to the Enum.to_list function.

Mongo.find(top, "dogs", %{}) |> Enum.to_list()
[
  %{"_id" => #BSON.ObjectId<66cc86e24ae06e2a2c893b4b>, "name" => "Greta"},
  %{"_id" => #BSON.ObjectId<66cc86f04ae06e2a2c2d4b58>, "name" => "Tom"},
  %{"_id" => #BSON.ObjectId<66cc86ff4ae06e2a2c579be9>, "name" => "Gustav"}
]
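Because we run a replica set, transactions are available, as mentioned earlier. The following is a minimal sketch of grouping two inserts into one transaction, assuming the driver's `Mongo.transaction/3` works as shown in its README (the function is expected to return an `{:ok, ...}` tuple to commit). The module name `Connect.Kennel` and the function `add_pair/3` are hypothetical.

```elixir
defmodule Connect.Kennel do
  # Hypothetical helper module; only the Mongo.* calls belong to the driver.

  # Inserts two dogs in one transaction, so either both documents
  # are written or neither is.
  def add_pair(conn, name_a, name_b) do
    Mongo.transaction(
      conn,
      fn ->
        {:ok, a} = Mongo.insert_one(conn, "dogs", %{name: name_a})
        {:ok, b} = Mongo.insert_one(conn, "dogs", %{name: name_b})
        # Returning an :ok tuple signals that the transaction should commit.
        {:ok, [a.inserted_id, b.inserted_id]}
      end,
      []
    )
  end
end
```

With the topology pid from above, the call would look like `Connect.Kennel.add_pair(top, "Waldo", "Bodo")`.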

Using a supervisor

We currently use a local pid of the process. If we add a worker process into our supervisor found in the application.ex file:

children = [
  {Mongo, [name: :vesoldo_db, url: "mongodb://localhost:27017/vesoldo"]}
]

and restart our application, then we can use the :vesoldo_db atom instead of a process pid. This allows us to call the Mongo functions directly:

Mongo.find(:vesoldo_db, "dogs", %{}) |> Enum.to_list()
[
  %{"_id" => #BSON.ObjectId<66cc86e24ae06e2a2c893b4b>, "name" => "Greta"},
  %{"_id" => #BSON.ObjectId<66cc86f04ae06e2a2c2d4b58>, "name" => "Tom"},
  %{"_id" => #BSON.ObjectId<66cc86ff4ae06e2a2c579be9>, "name" => "Gustav"}
]
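For orientation, here is a sketch of how the complete application.ex might look with this child in place. It assumes the skeleton that mix new --sup generates for a project named connect; only the entry in the children list comes from this article.

```elixir
defmodule Connect.Application do
  @moduledoc false

  use Application

  @impl true
  def start(_type, _args) do
    children = [
      # The topology process, registered under the name :vesoldo_db.
      {Mongo, [name: :vesoldo_db, url: "mongodb://localhost:27017/vesoldo"]}
    ]

    # If the topology process crashes, the supervisor restarts it.
    opts = [strategy: :one_for_one, name: Connect.Supervisor]
    Supervisor.start_link(children, opts)
  end
end
```

Because the process is registered by name, the rest of the application does not need to pass a pid around.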

Configure logging

The MongoDB driver comes with helpful logging, but we need to enable it first. To do so, we add the following line to our config.exs and restart the application:

config :mongodb_driver, log: true

The driver now logs each command sent to the server together with the duration of the response. In the example below, the find command was sent and it took about 1.1ms to receive the response.

Mongo.find(:vesoldo_db, "dogs", %{}) |> Enum.to_list()

19:38:53.212 [info] CMD find "dogs" [] db=1.1ms
[
  ...
]

If we add a new dog to our collection, the driver will log something like:

Mongo.insert_one(:vesoldo_db, "dogs", %{name: "Bodo"})

19:44:00.698 [info] CMD insert "dogs" [documents: [[_id: #BSON.ObjectId<66ccbee04ae06e632973c553>, name: "Bodo"]]] db=15.5ms
{:ok, %Mongo.InsertOneResult{acknowledged: true, inserted_id: #BSON.ObjectId<66ccbee04ae06e632973c553>}}

Listen to events

The driver sends different types of events when something has changed in the topology or when commands are sent to the database. We can subscribe to these and make the driver’s activity visible. To do this, we implement a simple GenServer.

defmodule EventHandler do

  require Logger

  use GenServer

  @me __MODULE__

  def start_link(_args) do
    GenServer.start_link(@me, :no_args, name: @me)
  end

  @impl true
  def init(:no_args) do
    Registry.register(:events_registry, :topology, [])
    {:ok, []}
  end

  @impl true
  def handle_info({:broadcast, :topology, %Mongo.Events.TopologyDescriptionChangedEvent{} = event}, state) do
    Logger.info("Event: #{inspect event.new_description.type}")
    {:noreply, state}
  end

  def handle_info(_message, state) do
    {:noreply, state}
  end

end

In the application.ex file we extend the list of the children:

  children = [
    {EventHandler, []},
    {Mongo, opts},
  ]

After restarting the application we can observe the updates of the topology:

15:40:45.098 [info] Event: :unknown
15:40:45.098 [info] Event: :unknown
15:40:45.098 [info] Event: :replica_set_with_primary

Monitor processes watch the individual servers in the background. If, for example, a new server is added, the topology is adjusted and an event is broadcast.

Interesting use cases can be derived from this feature: if the primary server fails, we can prevent all write operations until a new primary server has been found.
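One way to build such a write guard is sketched below. `WriteGate` is a hypothetical module, not part of the driver: it remembers the most recent topology type and runs a write only while that type is :replica_set_with_primary, the value seen in the log output above. Treating every other type as not writable is an assumption that fits our replica set; a standalone or sharded deployment would need a different rule.

```elixir
defmodule WriteGate do
  @moduledoc """
  Hypothetical helper (not part of the driver): remembers the latest
  topology type and refuses writes while no primary is known.
  """
  use Agent

  # Starts with :unknown, the type the driver reports before discovery.
  def start_link(_args) do
    Agent.start_link(fn -> :unknown end, name: __MODULE__)
  end

  # Intended to be called from EventHandler.handle_info/2 with
  # event.new_description.type.
  def update(type) when is_atom(type) do
    Agent.update(__MODULE__, fn _old -> type end)
  end

  # True only while the last reported topology had a primary.
  def writable? do
    Agent.get(__MODULE__, &(&1 == :replica_set_with_primary))
  end

  # Runs the zero-arity function only while a primary is known.
  def write(fun) when is_function(fun, 0) do
    if writable?(), do: fun.(), else: {:error, :no_primary}
  end
end
```

To wire it up, `{WriteGate, []}` would be added to the children before EventHandler, the event handler would call `WriteGate.update(event.new_description.type)`, and writes would go through `WriteGate.write(fn -> Mongo.insert_one(:vesoldo_db, "dogs", %{name: "Bodo"}) end)`. The check and the write are not atomic, so the caller still has to handle errors returned by the driver.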

Now we will learn how to use an X.509 user certificate to connect to a database hosted by the cloud provider Atlas.

Connecting to Atlas

Preparations: After we have set up a simple database via the Atlas dashboard, we need to enter our IP address under Network Access. Under Database Access we create a user with read and write access rights and select the X.509 Authentication Method.

While creating the new user we can download the user certificate (X509-cert-5985849297881971623.pem). Now we can follow the description in the driver's readme file. First, we extract the user name from the certificate:

openssl x509 -in X509-cert-5985849297881971623.pem -inform PEM -subject -nameopt RFC2253

> CN=user

We set the username attribute to CN=user. The auth_mechanism must be :x509. The configuration looks like this:

opts = [
    url: "mongodb+srv://cluster0.169dm4y.mongodb.net/vesoldo",
    ssl: true,
    username: "CN=user",
    auth_mechanism: :x509,
    ssl_opts: [
      verify: :verify_peer,
      cacerts: :public_key.cacerts_get(),
      certfile: ~c"./X509-cert-5985849297881971623.pem",
      customize_hostname_check: [
        match_fun:
          :public_key.pkix_verify_hostname_match_fun(:https)
      ]
    ]]

Mongo.start_link(opts)

In the ssl_opts we need to specify some options for Erlang's underlying ssl module. Here we specify the path to the user certificate as a charlist. The function :public_key.cacerts_get() returns the trusted CA certificates of the operating system; they are required because the verify_peer option makes the client verify the server certificate.

The customize_hostname_check setting enables wildcard certificate matching as it is used for HTTPS.

Now we can start our application and the driver will connect to the Atlas cluster.

Conclusion

We have seen two ways to connect: to a local replica set without any user, and to Atlas over an encrypted connection with a user certificate.

In addition to the usual log outputs, the driver also provides telemetry and events that can be used to monitor communication, performance and the use of the database.