
Move data between Firebase projects using Python

Part 2 of 2 in the series Firebase


In this post, we will see how to move data between Firebase projects using Python. In the previous post, we looked at the export/import process provided by Firebase, but it does not let you transfer data selectively; for example, you cannot recursively transfer all data from a selected collection to a new project. With a Python script, we have complete control over which collections or documents to move.

Another use case is when you want to change or add more data before transferring it to the new Firestore project. If you are new to Python, we highly recommend the educative.io courses on Python, which provide an interactive platform so you can code right in the educative platform while learning different concepts. Check our review of the courses here.

Let’s start.

Firestore – Service Account Keys

First, you need to generate service account keys for both projects to use in our Python script. We will refer to the projects as the export project and the import project.

To generate a service account key, go to your project settings in Firebase (the settings icon next to “Project Overview” in the left menu) and click on the “Service accounts” tab. Clicking “Generate new private key” will download a JSON file with the service account key. Make sure not to share it or make it part of your source repo. This key allows our Python script to access the data in our projects programmatically.

Generate Service account key

Download the keys for both projects and rename them “export_service_account_key.json” and “import_service_account_key.json”.
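
Since these keys grant admin access to your projects, keep them out of version control. If you use git, a minimal .gitignore entry (assuming the file names above) could look like this:

# .gitignore – never commit service account keys
export_service_account_key.json
import_service_account_key.json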

We will be using Python 3.x here. Create a virtual environment and install the dependencies:

# Create new virtual environment for python3
$ python3 -m venv fs_env

#Activate new environment
$ source fs_env/bin/activate
(fs_env) $

# Check Python version
(fs_env) $ python -V
Python 3.7.4
#install firebase-admin and google-cloud-firestore
(fs_env) $ pip install firebase-admin google-cloud-firestore
....
...
# pip list 
(fs_env) $ pip list
Package                  Version  
------------------------ ---------
CacheControl             0.12.6   
cachetools               4.1.1    
certifi                  2020.6.20
cffi                     1.14.3   
chardet                  3.0.4    
firebase-admin           4.4.0    
google-api-core          1.22.2   
google-api-python-client 1.12.1   
google-auth              1.21.2   
google-auth-httplib2     0.0.4    
google-cloud-core        1.4.1    
google-cloud-firestore   1.9.0    
google-cloud-storage     1.31.0   
google-crc32c            1.0.0    
google-resumable-media   1.0.0    
googleapis-common-protos 1.52.0   
grpcio                   1.32.0   
httplib2                 0.18.1   
idna                     2.10     
msgpack                  1.0.0    
pip                      19.0.3   
protobuf                 3.13.0   
pyasn1                   0.4.8    
pyasn1-modules           0.2.8    
pycparser                2.20     
pytz                     2020.1   
requests                 2.24.0   
rsa                      4.6      
setuptools               40.8.0   
six                      1.15.0   
uritemplate              3.0.1    
urllib3                  1.25.10  

Now create or go to your working folder. Add both projects’ service account keys to the folder and create a new Python script called move_data.py.
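
The working folder should then contain both keys and the script, roughly like this (the folder name is just an example):

firestore_migration/
    export_service_account_key.json
    import_service_account_key.json
    move_data.py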

Firestore – Client Initialisation Functions

import firebase_admin


from firebase_admin import credentials, firestore
from firebase_admin.firestore import SERVER_TIMESTAMP

read_objects = []
export_db = None
import_db = None
count = 0
# Name of the firestore collection in export project for reading
export_collection_name = "ig"
# Name of the firestore collection in import project for writing
import_collection_name = "ig"

def batch_data(iterable, n=1):
    l = len(iterable)
    for ndx in range(0, l, n):
        yield iterable[ndx:min(ndx + n, l)]


def init_export_db():
    global export_db
    cred = credentials.Certificate("./export_service_account_key.json")
    primary_app = firebase_admin.initialize_app(cred,name="primary")
    export_db = firestore.client(app=primary_app)
    

def init_import_db():
    global import_db
    cred = credentials.Certificate("./import_service_account_key.json")
    secondary_app = firebase_admin.initialize_app(cred,name="secondary")
    import_db = firestore.client(app=secondary_app)
    

We will use the batch_data function to split the list of documents into chunks of size n when batch-writing to the new project. Note that each firebase_admin app is initialised with its own name (“primary” and “secondary”) so that clients for both projects can exist side by side in the same script.
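
As a quick standalone check of how batch_data splits a list (independent of Firestore, using the function defined above):

numbers = list(range(25))

# Splits 25 items into chunks of 10: [0..9], [10..19], [20..24]
for chunk in batch_data(numbers, 10):
    print(len(chunk), chunk[0], chunk[-1])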

Firestore – Read Collection from Export Project

Let’s add a function to read the collection from the export project.

def read_data_from_firestore():
    global read_objects,export_db, export_collection_name,count
    docs = export_db.collection(export_collection_name).stream()
    read_objects = []
    
    for doc in docs:
        #print(u'{} => {}'.format(doc.id, doc.to_dict()))
        obj = doc.to_dict()
        #data_obj = {}
        #data_obj["dialog"] = obj["dialog"]
        #data_obj["movie"] = "Movie Name:"+obj["movie"]
        #read_objects.append(data_obj)
        read_objects.append(obj)
        count+=1

We are reading the documents into read_objects. If you need to change, massage, or filter the data before storing it in the new project, this is a good place to do it; the commented-out code shows an example, and another sketch is given below.
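
For instance, here is a hedged sketch of a transformed loop body, assuming hypothetical “movie” and “year” fields (adjust the field names to your own document structure):

    for doc in docs:
        obj = doc.to_dict()
        # Skip documents we do not want to migrate (hypothetical "year" field)
        if obj.get("year", 0) < 2000:
            continue
        # Reshape fields before writing to the import project
        obj["movie"] = "Movie Name: " + obj.get("movie", "")
        read_objects.append(obj)
        count += 1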

Firestore – Write Data to the Import Project

The following function writes the data in batches to the new project. Here the batch size is 10, but you can raise it if required (Firestore allows at most 500 operations per batched write). You can change the destination collection in the new project by changing the import_collection_name value in the script.

Also, for my projects, most of the time I don’t use the auto-generated document IDs from Firestore. The commented-out line below shows how you can set a document’s ID yourself, and a small sketch after the function shows one way to keep the original IDs.

def write_to_firestore():
    global read_objects,import_db, import_collection_name
    print("uploading to "+import_collection_name)
    #print(read_objects) 
    for batched_data in batch_data(read_objects,10):
        batch = import_db.batch()
        for data_item in batched_data:
            #print(data_item)
            #doc_ref = import_db.collection(import_collection_name).document(data_item['g'])
            doc_ref = import_db.collection(import_collection_name).document()
            batch.set(doc_ref, data_item)
        batch.commit()
    
    print("import DB updated")

Now call all the functions from the main block to tie everything together.

if __name__ == '__main__':
    count = 0
    init_export_db()
    read_data_from_firestore()
    init_import_db()
    write_to_firestore()
    print("Total "+str(count)+" uploaded")

Now our script is ready. Let’s run it.

(fs_env) $ python move_data.py 
uploading to ig
import DB updated
Total 10 uploaded

You can now see in the Firebase console that the data has been uploaded to the new project successfully.

That is the end of our script. Let me know if you find it helpful or need any other help. The init_import_db and init_export_db functions are the most important parts for this script to work, since they connect the script to both projects.

Complete code below:

import firebase_admin

from firebase_admin import credentials, firestore
from firebase_admin.firestore import SERVER_TIMESTAMP

read_objects = []
export_db = None
import_db = None
count = 0
export_collection_name = "ig"
import_collection_name = "ig"

def batch_data(iterable, n=1):
    l = len(iterable)
    for ndx in range(0, l, n):
        yield iterable[ndx:min(ndx + n, l)]


def init_export_db():
    global export_db
    cred = credentials.Certificate("./export_service_account_key.json")
    primary_app = firebase_admin.initialize_app(cred,name="primary")
    export_db = firestore.client(app=primary_app)
    

def init_import_db():
    global import_db
    cred = credentials.Certificate("./import_service_account_key.json")
    secondary_app = firebase_admin.initialize_app(cred,name="secondary")
    import_db = firestore.client(app=secondary_app)
    

def read_data_from_firestore():
    global read_objects,export_db, export_collection_name,count
    docs = export_db.collection(export_collection_name).stream()
    read_objects = []
    
    for doc in docs:
        print(u'{} => {}'.format(doc.id, doc.to_dict()))
        obj = doc.to_dict()
        #data_obj = {}
        #data_obj["dialog"] = obj["dialog"]
        #data_obj["movie"] = "Movie Name:"+obj["movie"]
        #read_objects.append(data_obj)
        read_objects.append(obj)
        count+=1


def write_to_firestore():
    global read_objects,import_db, import_collection_name
    print("uploading to "+import_collection_name)
    #print(read_objects) 
    for batched_data in batch_data(read_objects,10):
        batch = import_db.batch()
        for data_item in batched_data:
            #print(data_item)
            # Example: use a field value as the document id instead of an auto-generated one
            #doc_ref = import_db.collection(import_collection_name).document(data_item['g'])
            doc_ref = import_db.collection(import_collection_name).document()
            batch.set(doc_ref, data_item)
        batch.commit()
    
    print("import DB updated")

    

if __name__ == '__main__':
    count = 0
    init_export_db()
    read_data_from_firestore()
    init_import_db()
    write_to_firestore()
    print("Total "+str(count)+" uploaded")
    