Based on Comevussor's answer, I ended up with this code:
    import numba as nb
    import numpy as np

    @nb.njit(nb.i8[:](nb.i8, nb.i8), fastmath=True)
    def celdas_vecinas(cell, n):
        Nt = n**2      # total number of cells
        x = cell % n   # x,y cell coordinates
        y = cell // n
        # left/right neighbours wrap around within the same row
        izq = (x - 1) % n + y * n
        der = (x + 1) % n + y * n
        # up/down neighbours wrap around over the whole grid
        arri = (x % n + (y+1) * n) % Nt
        aba = (x % n + (y-1) * n) % Nt
        # diagonals: shift the left/right neighbours by one row
        aba_izq = (izq - n) % Nt
        aba_der = (der - n) % Nt
        arri_izq = (izq + n) % Nt
        arri_der = (der + n) % Nt
        return np.array([cell, aba_izq, aba, aba_der, izq, der, arri_izq, arri, arri_der])
which gives the following performance:
    >>> %timeit celdas_vecinas(0,5)
    567 ns ± 13.8 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
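To sanity-check the index arithmetic, here is a minimal pure-Python sketch of a brute-force reference that wraps each coordinate separately and returns the neighbours in the same order. The name `vecinas_referencia` is hypothetical (it is not part of the original code), and it is meant for checking results, not for speed.

```python
def vecinas_referencia(cell, n):
    """Brute-force periodic neighbours of `cell` on an n x n grid.

    Hypothetical helper for cross-checking; returns the cell itself first,
    then the eight neighbours row by row (below, same row, above).
    """
    x, y = cell % n, cell // n
    vecinas = [cell]
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if dx == 0 and dy == 0:
                continue  # the cell itself is already first in the list
            # wrap x and y independently, then convert back to a flat index
            vecinas.append((x + dx) % n + ((y + dy) % n) * n)
    return vecinas


print(vecinas_referencia(0, 5))  # [0, 24, 20, 21, 4, 1, 9, 5, 6]
```

Comparing `list(celdas_vecinas(c, n))` against `vecinas_referencia(c, n)` for every `c` in `range(n**2)` should show whether the jitted version agrees on all cells, including corners and edges.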